Asynchronous requests in AppEngine - google-app-engine

I'm building an app that essentially does the following:
Get the user to enter certain parameters.
Pass those params to the backend and start a task based on those
params.
When the task is complete redirect the user to another page showing
the results of the task.
The problem here is that the task is expected to take quite long. I was thus hoping to make the request asynchronous. Does appengine allow this ?
If not, what are my options ? I was looking at the documentation for task queues. While it satisfies part of what I'm trying to do, I'm not very clear on how the queue notifies the client when the task is complete, so that the redirect can be initiated.
Also, what if the results of the task have to be returned to the calling client itself ? Is that possible ?

You can't (shouldn't really) wait for completion, GAE is not designed for that. Just launch the task, get a task ID (unique, persisted it in the app) and send the ID back to the client in the response to the launch request.
The client can check, either by polling (at a reasonable rate) or simply on-demand, that status page (you can use the ID to find the right task). You can even add a progress/ETA info on that page, down the road if you so desire.
After the task completes the next status check request from the client can be redirected to the results page as you mentioned.
This Q&A might help as well, it's a very similar scenario, only using the deferred library: How do I return data from a deferred task in Google App Engine
Update:
The Task Queues are preferable to the deferred library, the deferred functionality is available using the optional countdown or eta arguments to taskqueue.add():
countdown -- Time in seconds into the future that this task should run or be leased. Defaults to zero. Do not specify this argument if
you specified an eta.
eta -- A datetime.datetime that specifies the absolute earliest time at which the task should run. You cannot specify this argument if
the countdown argument is specified. This argument can be time
zone-aware or time zone-naive, or set to a time in the past. If the
argument is set to None, the default value is now. For pull tasks, no
worker can lease the task before the time indicated by the eta
argument.

Related

Programatically listing and sending requests to dynamic App Engine instances

I want to send a particular HTTP request (or otherwise communicate a message) to every (dynamic/autoscaled) instance which is currently running for a particular App Engine application.
My goal is to trigger each instance to discard some locally cached data (because I have just modified the underlying data and want them to reload it).
One possible solution is to store a value in Memcache, and have instances check this each time they handle a request to see if they should flush their cache. But this adds latency to every request.
Another possible solution would be to somehow stop all running instances. No fixed overhead, but some impact while instances are restarted.
An even less desirable solution would be to redeploy the application code in order to cause all instances to be stopped. This now adds additional delay on my end as a deployment takes some time.
You could use the management API to list instances for a given version, but I'd suggest that you'd probably want to use something like the PubSub API to create a subscription on each of your App Engine instances. Since each instance has its own subscription, any messages sent to the monitored queue will be received by all instances.
You can create the subscription at startup (the /_ah/start endpoint may be useful), and then delete it at shutdown (using the /_ah/stop endpoint).

why does the default task queue of google app engine gets executed endlessly?

I am importing contacts from a CSV file, and using the blobstore service of the google app engine to save the blob and i send the blobkey as a parameter to the task queue url. So that the task queue url can use the blob key to parse the CSV file and save it in the datastore.
This here is my java code for creating a task queue.
Queue queue = QueueFactory.getDefaultQueue();
queue.add(TaskOptions.Builder.withUrl("/queuetoimport").param("contactsToImport", contactsDetail));
The Task queue actually gets executed but it does not end. It endlessly keep on saving the same contact to the datastore until i manually delete it.
What could be the reason.
This is done for error recovery. Suppose, for example, that your task was fetching a JSON feed from the network, parsing it, and storing it in a database... in the event that the network connection failed, timed out, etc. or the feed that was returned happened to be bad temporarily and failed to parse or any other number of intermittent, probabilistic sources of failure, this automatic retrying behavior (with exponential back off) would ensure that the task eventually completed successfuly (assuming that the failure is one that could be fixed by retrying and not a programmer error that would guarantee failure each and every time). The HTTP status code of the task is used to determine how the task completed (successfully or unsuccessfully) to determine if it needs to be retried. If you don't want the task to be retried, make sure it completes succesfully (and lets App Engine know about it by using a success status code, which is any of the 2xx-level codes).
If you consider the contacts example, ensuring that the contact is saved (even if there is a temporary glitch in the task handler for it), is much better than silently dropping user data.

How can Google App Engine be prevented from immediately rescheduling tasks after status code 500?

I have a Google App Engine servlet that is cron configured to run once a week. Since it will take more than 1 minute of execution time, it launches a task (i.e. another servlet task/clear) on the application's default push task queue.
Now what I'm observing is this: if the task causes an exception (e.g. NullPointerException inside its second servlet), this gets translated into HTTP status 500 (i.e. HttpURLConnection.HTTP_INTERNAL_ERROR) and Google App Engine apparently reacts by immediately relaunching the same task again. It announces this by printing:
Web hook at http://127.0.0.1:8888/task/clear returned status code 500. Rescheduling..
I can see how this can sometimes be a feature, but in my scenario it's inappropriate. Can I request that Google App Engine should not do such automatic rescheduling, or am I expected to use other status codes to indicate error conditions that would not cause rescheduling by its rules? Or is this something that happens only on the dev. server?
BTW, I am currently also running other tasks (with different frequencies) on the same task queue, so throttling reschedules on the level of task queue configuration would be inconvenient (so I hope there is another/better option too.)
As per https://developers.google.com/appengine/docs/java/taskqueue/overview-push#Java_Task_execution - the task must return a response code between 200 and 299.
You can either return the correct value, set the taskRetryLimit in RetryOptions or check the header X-AppEngine-TaskExecutionCount when task launches to check how many times it has been launched and act accordingly.
I think I've found a solution: in the Java API, there is a method RetryOptions#taskRetryLimit, which serves my case.

How to interrupt current request, Google App Engine?

My website on GAE-Python has a function to calculate some math using Evolutionary Optimization Algorithm, which will be called by an ajax request when the user click a button. Each request usually takes very long time to finish calculation.
I need some way (ajax or other methods) to tell the server to cancel the current request rather than using ajax's xhr.abort() function which does not stop the calculation on server side.
For an early attempt, I have found that GAE has the Request Timer in which the DeadlineExceededError will be raised by the runtime if the request takes too long to finish.
Based on this idea, I would like to ask if there is a way to send a signal to the server to cause the runtime to trigger an interrupt on the request?
You shouldn't be trying to do any long-running tasks synchronously in a handler. This is the perfect candidate for a task queue. The Ajax request should simply push the task onto the queue, and App Engine will process it offline. Tasks get a ten-minute timeout.
You can use memcache or the datastore to pass information to and from your Ajax code. For instance, the task handler could check memcache every few seconds for the existence of a 'stop_processing_FOO' key (where FOO is a unique ID generated by the Ajax when the task is first triggered), and the your 'cancel' button would call a handler to insert that key into memcache.
Similarly, the task could put a 'finished_processing_FOO' key with the associated values into memcache when it finishes, and your Ajax could periodically poll a handler that checks if that key is present, and return the value if so.

Google App Engine - Cron or Task Queue?

I'm building a simple "play against a random opponent" back-end using Goole App Engine. So far I'm adding each user that wants to play into a "table" in the Datastore. As soon as there are more than 1 player in the Datastore, I can start to match them.
The Schedule Tasks with Cron looked promising for this work until I saw that the lowest resolution seems to be minutely. If there are plenty of players signing up I want them to be matched quickly and not have to wait a whole minute (worst case).
I thought about having the servlet that recives the "play against random opponent" request POST to a Task Queue that would do the match making, but I think this will lead to a lot of contention when reading from the Datastore and deleting the enteties from the "random" table after they have been matched?
Basically I want a single worker that will do the matching, and I want to signal this worker from time to time that now is a good time to try to match opponents.
Any suggestions on what would be the right course of action here?
You can guarantee exclusive access via transactions:
Receive a request to play via REST. Check (within a transaction) if there is any request in database.
If there is, notify both users to start the play and delete request (transactionaly) from database.
If there isn't, add it to the database and wait for the next request.
Update:
Alternativelly you can achieve what you need via pull queue. Same scenario as above, just instead of datastore you'd check if there is a task in the pull queue, retrieve if there is or create a new one if there isn't one.

Resources