How can Google App Engine be prevented from immediately rescheduling tasks after status code 500? - google-app-engine

I have a Google App Engine servlet that is cron configured to run once a week. Since it will take more than 1 minute of execution time, it launches a task (i.e. another servlet task/clear) on the application's default push task queue.
Now what I'm observing is this: if the task causes an exception (e.g. NullPointerException inside its second servlet), this gets translated into HTTP status 500 (i.e. HttpURLConnection.HTTP_INTERNAL_ERROR) and Google App Engine apparently reacts by immediately relaunching the same task again. It announces this by printing:
Web hook at http://127.0.0.1:8888/task/clear returned status code 500. Rescheduling..
I can see how this can sometimes be a feature, but in my scenario it's inappropriate. Can I request that Google App Engine should not do such automatic rescheduling, or am I expected to use other status codes to indicate error conditions that would not cause rescheduling by its rules? Or is this something that happens only on the dev. server?
BTW, I am currently also running other tasks (with different frequencies) on the same task queue, so throttling reschedules on the level of task queue configuration would be inconvenient (so I hope there is another/better option too.)

As per https://developers.google.com/appengine/docs/java/taskqueue/overview-push#Java_Task_execution - the task must return a response code between 200 and 299.
You can either return the correct value, set the taskRetryLimit in RetryOptions or check the header X-AppEngine-TaskExecutionCount when task launches to check how many times it has been launched and act accordingly.

I think I've found a solution: in the Java API, there is a method RetryOptions#taskRetryLimit, which serves my case.

Related

GAE, This is likely to cause a new process to be used for the next request to your application

I use GAE standard environment and cron service. I deploy my app and cron jobs successfully based on app.yaml and cron.yaml.
And, when I run cron jobs firstly. I got below logs from stackdriver:
This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application.
A problem was encountered with the process that handled this request, causing it to exit. This is likely to cause a new process to be used for the next request to your application.
How do I solve this? thanks.

Long-running script on Google App Engine

I'm attempting to create a microservice on Google App Engine that is not intended to handle HTTP requests.
Instead, I was hoping to have a continuously running Python script that monitors a remote queue--RabbitMQ, to be precise--and sends out an api-call to another service as tasks are pushed to the queue.
I was wondering, firstly, is it possible to run a script upon deployment--one that did not originate with a user action/request?
Secondly, how would I accomplish this?
Thanks in advance for your time!
You can deploy your "script" as a manually scaled module -- see https://cloud.google.com/appengine/docs/python/modules/ -- with exactly one instance. As the docs say, "When you start a manual scaling instance, App Engine immediately sends a /_ah/start request to each instance"; so, just set that module's handler for /_ah/start to the handler you want to run (in the module's yaml file and the WSGI app in the Python code, using whatever lightweight framework you like -- webapp2, falcon, flask, bottle, or whatever else... the framework won't be doing much for you in this case save the one-off routing).
Note that the number of free machine hours for manual scaling modules is limited to 8 hours per day (for the smaller, B1 instance class; proportionally fewer for larger instance classes), so you may need to upgrade to paid-app status if you need to run for more than 8 hours.
Like #brant said, App Engine is designed to handle HTTP requests. It's not a perfect fit for background jobs, unless you try to wrap your logic into one http request.
Further, App Engine will emit an error when the response timeout, depending on your scaling settings. If you want to try it, consider basic or manual scaling.
For this type of workload, I would suggest you use a VM.
I think there are a few problems with this design.
First, App Engine is designed to be an HTTP request processor, not a RabbitMQ message processor. GAE is intended for many small requests, not one long-running process.
Second, "RabbitMQ should not be exposed to the public internet, it wasn't created for such use case."
I would recommend that you keep the RabbitMQ clients on the same internal network as the RabbitMQ broker, and have the clients send HTTP requests to App Engine.

How do I run a cron job on Google App Engine immediately?

I have configured Google App Engine to record exception with ereporter.
The cron job is configured to run every 59 minutes. The cron.yaml is as follows
cron:
- description: Daily exception report
url: /_ereporter?sender=xxx.xxx#gmail.com # The sender must be an app admin.
schedule: every 59 minutes
How to do I run this immediately.
What I am trying to do here is simulate a 500 HTTP error and see the stack trace delivered immediately via the cron job.
Just go to the URL from your browser.
You can't using cron. Cron is a scheduling system, you could get it to run every minute.
Alternately you could wrap your entire handler in a try/except block and try to catch everything. (You can do this for some DeadlineExceededErrors for instance) then fire off a task which invokes ereporter handler, and then re-raise the Exception.
However in many cases Google infrastructure can be the cause of the Error 500 and you won't be able to catch the error. To be honest you are only likely to be able to cause an email sent for a subset of all possible Error 500's. The most reliable way probably be to have a process continuously monitor the logs, and email from there.
Mind you email isn't consider reliable or fast so a 1 min cron cycle is probably fast enough.
I came across this thread as I was trying to do this as well. A (hacky) solution I found was to add a curl command at the end of my cloudbuild.yaml file that triggers the file immediately per this thread. Hope this helps!
Make a curl request in Cloud Build CI/CD pipeline

GAE behavior when relocating an application to another server

Two questions:
Does Google App Engine send any kind of message to an application just before relocating it to another server?
If so, what is that message?
No it doesnt. It doesnt relocate either, old instances keep running (and eventually stop when idle for long enough) while new ones are spawned.
There are times when App Engine needs to move your instance to a different machine to improve load distribution.
When App Engine needs to turn down a manual scaling instance it first
notifies the instance. There are two ways to receive this
notification. First, the is_shutting_down() method from
google.appengine.api.runtime begins returning true. Second, if you
have registered a shutdown hook, it will be called. It's a good idea
to register a shutdown hook in your start request. After the
notification is issued, existing requests are given 30 seconds to
complete, and new requests immediately return 404.
If an instance is
handling a request, App Engine pauses the request and runs the
shutdown hook. If there is no active request, App Engine sends an
/_ah/stop request, which runs the shutdown hook. The /_ah/stop request
bypasses normal handling logic and cannot be handled by user code; its
sole purpose is to invoke the shutdown hook. If you raise an exception
in your shutdown hook while handling another request, it will bubble
up into the request, where you can catch it.
The following code sample demonstrates a basic shutdown hook:
from google.appengine.api import apiproxy_stub_map
from google.appengine.api import runtime
def my_shutdown_hook():
apiproxy_stub_map.apiproxy.CancelApiCalls()
save_state()
# May want to raise an exception
runtime.set_shutdown_hook(my_shutdown_hook)
Alternatively, the following sample demonstrates how to use the is_shutting_down() method:
while more_work_to_do and not runtime.is_shutting_down():
do_some_work()
save_state()
More details here: https://developers.google.com/appengine/docs/python/modules/#Python_Instance_states

'Version is not ready' error on update - GAE Python

I am unable to update my frontends nor my backends. I get the error message 'Version is not ready'. This bug has persisted for coming up to 24 hours now. I have a task perpetually running in a queue. My best guess is that this task is stopping the update. I am unable to delete the task as it is perpetually running, nor can I delete the queue as I am unable to upload a new queue.yaml definition. The same task previously failed due to a maximum recursion error as I had a synchronous RPC within an asynchronous tasklet.
I'm pretty sure the fix will require someone from the GAE side forcibly resetting the task queue. Thus, this question would be more suitably directed to the GAE team with details about my app in a less public forum. Though, from what I can see, they do not allow direct support questions and suggest posting the question here. My follow up question, then, is when you have a GAE issue that requires action from the GAE team - how do you get hold of them (other than paying US$500/month for a premium support account)?
EDIT:
The task is/was meant to be running on a backend instance. I intended to shutdown all backend and frontend instances via the console assuming that they would cancel the task and restart themselves. But I found that only one frontend instance was running - no backends. After shutting down that frontend instance, the dashboard has reported that I have 0 instances running, yet the website is still serving and the task remains perpetually running.
EDIT:
Disabling the app stopped the task from running. After reenabling the app, I was able to update it. Though I am left with a ghost task in my queue.
If you have a stuck task queue job, I'd try disabling the queue and killing the instance running that job. If that doesn't work, I'd try disabling the app temporarily.

Resources