AppEngine - What's the timeout for Node Cloud Tasks handlers?

I have an application that does some work in the background, using the default Cloud Tasks queue for scheduling/executing the process.
I would like the job to be able to run for a few minutes, or at least understand what the actual limitations are and what I can do about them.
According to the docs on Push Queues (which seem to be equivalent to the modern Cloud Tasks?), the deadline is 10 minutes for automatic scaling and 24 hours for basic scaling.
However, my job seems to crash after 2 minutes: 115 seconds is fine, 121 seconds is a crash. The workload and resource consumption are the same in all cases. The message is always the unhelpful "The process handling this request unexpectedly died. This is likely to cause a new process to be used for the next request to your application. (Error code 203)".
It does not matter whether I use an auto-scaling F2 instance or a basic-scaling B2 one; it gets terminated after 2 minutes.
According to the docs on Node request handling, there is a 60-second timeout for "request handlers".
So what is the timeout in the end? Is it 1 minute, 2 minutes, or 10 minutes? And is there anything I can do to change it, if I want my job to run for 5 or 30 minutes?

In short: the best explanation for your scenario is Node's request timeout, which defaults to exactly 2 minutes.
In long: after reading your question, I decided to build a proof of concept:
- created a dummy Node 8 service that uses only the built-in HTTP server
- created a URL path that produces an artificially long response (using setTimeout), with the duration taken from the request (e.g. /lr/300 responds after approximately 5 minutes)
- deployed it to a GAE service other than default (Node 8, automatic scaling)
- created a Cloud Tasks task that requests /lr/540 on that service
Before the fix (screenshot omitted): Cloud Tasks and App Engine could not wait longer than 2 minutes, and produced the same unhelpful message you got (The process handling this request unexpectedly died...).
Then I added one line to the server code in order to increase the global request timeout.
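The screenshots are not reproduced here, so as a stand-in, here is a minimal sketch of the kind of server described above and the kind of line involved, assuming a plain Node 8 built-in HTTP server (the exact code the author wrote is in the missing screenshot):

const http = require('http');

const server = http.createServer((req, res) => {
  // PoC route: /lr/<seconds> responds after roughly that many seconds.
  const match = req.url.match(/^\/lr\/(\d+)$/);
  const seconds = match ? parseInt(match[1], 10) : 0;
  setTimeout(() => res.end('responded after ' + seconds + 's'), seconds * 1000);
});

// The fix: http.Server drops idle sockets after 120000 ms (2 minutes)
// by default, which matches the crash at ~2 minutes. Raising the limit
// (or passing 0 to disable it) lets a handler run up to App Engine's
// own request deadline instead.
server.setTimeout(10 * 60 * 1000);

server.listen(process.env.PORT || 8080);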
And the result (screenshot omitted): the /lr/540 request now completes. In my case, I can safely say that it was Node's request timeout causing the problem. I hope this is of use to you too.

Related

NDB query().iter() of 1000<n<1500 entities is wigging out

I have a script that, using Remote API, iterates through all entities for a few models. Let's say two models, called FooModel with about 200 entities, and BarModel with about 1200 entities. Each has 15 StringPropertys.
for model in [FooModel, BarModel]:
    print 'Downloading {}'.format(model.__name__)
    new_items_iter = model.query().iter()
    new_items = [i.to_dict() for i in new_items_iter]
    print new_items
When I run this in my console, it hangs for a while after printing 'Downloading BarModel'. It hangs until I hit ctrl+C, at which point it prints the downloaded list of items.
When this is run in a Jenkins job, there's no one to press ctrl+C, so it just runs continuously (last night it ran for 6 hours before something, presumably Jenkins, killed it). Datastore activity logs reveal that the datastore was taking 5.5 API calls per second for the entire 6 hours, racking up a few dollars in GAE usage charges in the meantime.
Why is this happening? What's with the weird behavior of ctrl+C? Why is the iterator not finishing?
This is a known issue currently being tracked on the Google App Engine public issue tracker under Issue 12908. The issue was forwarded to the engineering team and progress on this issue will be discussed on said thread. Should this be affecting you, please star the issue to receive updates.
In short, the issue appears to be with the remote_api script. When querying entities of a given kind, it will hang when fetching 1001 + batch_size entities when the batch_size is specified. This does not happen in production outside of the remote_api.
Possible workarounds
Using the remote_api
One could limit the number of entities fetched per script execution using the limit argument for queries. This may be somewhat tedious but the script could simply be executed repeatedly from another script to essentially have the same effect.
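As an illustration, here is a hedged sketch of that idea using NDB's standard fetch_page (a cursor-based variant of the same limit-per-call approach; the helper name and page size are made up):

from google.appengine.ext import ndb

def fetch_all_in_pages(model, page_size=500):
    # Fetch entities in pages small enough to stay clear of the
    # 1001 + batch_size hang; each fetch_page call is its own query.
    items, cursor, more = [], None, True
    while more:
        page, cursor, more = model.query().fetch_page(
            page_size, start_cursor=cursor)
        items.extend(e.to_dict() for e in page)
    return items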
Using admin URLs
For repeated operations, it may be worthwhile to build a web UI accessible only to admins. This can be done with the help of the users module as shown here. This is not really practical for a one-time task but far more robust for regular maintenance tasks. As this does not use the remote_api at all, one would not encounter this bug.
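For illustration, a minimal sketch of such an admin-only handler, assuming the Python runtime with webapp2 (the handler name and URL are invented):

import webapp2
from google.appengine.api import users

class MaintenanceHandler(webapp2.RequestHandler):
    def get(self):
        # Belt-and-braces check on top of a login: admin guard in app.yaml.
        if not users.is_current_user_admin():
            self.abort(403)
        # ... run the maintenance query here, in production, where the
        # remote_api bug does not apply ...
        self.response.write('done')

app = webapp2.WSGIApplication([('/admin/maintenance', MaintenanceHandler)])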

How do I execute code on App Engine without using servlets?

My goal is to receive updates from some service (using HTTP request-response) all the time, and when I get a specific piece of information, push it to the users. This code must run all the time (say, every 5 seconds).
How can I run code that does this while the server is up (that is, not initiated by an HTTP request)?
I'm using Java.
Thanks
You need to use Scheduled Tasks With Cron for Java. You can set your own schedule (e.g. every minute), and it will call a specified handler for you.
You may also want to look at App Engine Modules in Java before you implement your code. You can separate your user-facing and backend code into different modules with different scaling options.
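For reference, a minimal cron.xml for the scheduled-task route (the URL, description, and schedule are placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
  <cron>
    <url>/tasks/poll-service</url>
    <description>Poll the external service for updates</description>
    <schedule>every 1 minutes</schedule>
  </cron>
</cronentries>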
UPDATE:
A great comment from @tx802:
I think 1 minute is the highest frequency you can achieve on App Engine, but you can use cron to put 12 tasks on a Push Queue, either with delays of 5s, 10s, ... 55s using TaskOptions.countdownMillis() or with a processing rate of 1/5 sec.
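A hedged sketch of that fan-out in Java (the queue choice and worker URL are invented for the example):

import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

// Called by cron once a minute; spreads 12 tasks across the minute so
// the worker URL fires roughly every 5 seconds.
public void enqueueSubMinuteTasks() {
    Queue queue = QueueFactory.getDefaultQueue();
    for (int i = 0; i < 12; i++) {
        queue.add(TaskOptions.Builder
                .withUrl("/tasks/poll")
                .countdownMillis(i * 5000L));
    }
}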

Getting "DeadlineExceededError" using GAE when doing many (~10K) DB updates

I am using Django 1.4 on GAE + Google Cloud SQL. My code works perfectly fine (on dev with a local sqlite3 db for Django) but chokes with Server Error (500) when I try to "refresh" the DB. This involves parsing certain files, creating ~10K records, and saving them (I'm saving in batches using commit_on_success).
Any advice?
This error is raised for front-end requests after 60 seconds (the limit was raised from the earlier 30 seconds).
Solution options:
- Use a task queue (a 10-minute limit applies there too, but that is usually enough in practice).
- Divide your task into smaller batches.
How we do it: we divide the work into smaller chunks on the client side and submit them repeatedly.
Both solutions work fine; which to use depends on how you make these calls and whether you need the results back. A task queue doesn't return results to the client.
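For the batching route, a hedged Django 1.4 sketch (the chunk size and helper are invented; commit_on_success is the transaction decorator the question already uses):

from django.db import transaction

CHUNK = 500  # small enough to finish well inside the request deadline

def save_in_chunks(records):
    # Commit in independent chunks so no single request (or task)
    # has to write all ~10K rows at once.
    for start in range(0, len(records), CHUNK):
        with transaction.commit_on_success():
            for record in records[start:start + CHUNK]:
                record.save()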
For tasks that take longer than 30 seconds you should use the task queue.
Database operations can also time out when batch operations are too big. Try to use smaller batches.
Google App Engine has a maximum time allowed for a request. If a request takes longer than 30 seconds, this error is raised. If you have a large quantity of data to upload, either import it directly from the admin console, break the request into smaller chunks, or use the command line (python manage.py dbshell) to upload the data from your computer.

GAE mapper producing 'No quota, aborting' errors

I am trying to set up a mapper job on Google App Engine using the mapper framework here (Java version): http://code.google.com/p/appengine-mapreduce/
I kick the job off with code like this:
Configuration conf = new Configuration(false);
conf.setClass("mapreduce.map.class", MyMapper.class, Mapper.class);
conf.setClass("mapreduce.inputformat.class", DatastoreInputFormat.class, InputFormat.class);
conf.set(DatastoreInputFormat.ENTITY_KIND_KEY, "Organization");
// Queue up the mapper request.
String configString = ConfigurationXmlUtil.convertConfigurationToXml(conf);
Queue queue = GaeQueueFactory.getQueue(QUEUE_NAME);
queue.add(
    Builder.url("/mapreduce/start")
        .param("configuration", configString));
I get the following error in the logs on both the dev server and prod server:
com.google.appengine.tools.mapreduce.MapReduceServlet processMapper: No quota. Aborting!
There is no additional stack trace. This appears about a dozen or so times each time I try kick a job off.
I think you don't have enough quota to process your Mapper job with the predefined default values.
Try to lower these configuration parameters:
mapreduce.mapper.inputprocessingrate
The aggregate number of entities processed per second by all mappers. Used to prevent large amounts of quota being used up in a short time period. Default: 1000.
mapreduce.mapper.shardcount
The number of concurrent workers to use. This also determines the number of shards the input is split into. Default: 8.
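A hedged sketch of lowering both values on the same Configuration object the question builds (the numbers themselves are arbitrary):

// Process fewer entities per second and split into fewer shards so the
// job stays inside the available quota.
conf.set("mapreduce.mapper.inputprocessingrate", "100");
conf.set("mapreduce.mapper.shardcount", "2");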
I have figured out the issue and am recording it here for anyone else who hits this problem. What confused me most is that it was working several weeks ago and seems to have stopped sometime recently; because the app isn't being used in a production environment, no one noticed.
What changed was that I had started sending map requests to a custom 'mapper' task queue instead of the default task queue. Because I have several mapper jobs, I set up a shared queue for all of them to use:
<queue>
  <name>mapper</name>
  <rate>5/s</rate>
  <bucket-size>10</bucket-size>
</queue>
When I switched the code back to using the default queue, everything worked as expected. I have filed a bug with the mapper team here: http://code.google.com/p/appengine-mapreduce/issues/detail?id=73

How can I do the same thing over and over every 1-4 seconds in google app engine?

I want to run a script every few seconds (4 or less) in google app engine to process user input and generate output. What is the best way to do this?
Run a cron job.
http://code.google.com/appengine/docs/python/config/cron.html
http://code.google.com/appengine/docs/java/config/cron.html
A cron job will invoke a URL at a given time of day. A URL invoked by cron is subject to the same limits and quotas as a normal HTTP request, including the request time limit.
Also consider the Task Queue - http://code.google.com/appengine/docs/python/taskqueue/overview.html
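As a hedged sketch of the task-queue approach in Python (the handler name and path are invented), a task can re-enqueue itself with a countdown so the work repeats every few seconds:

import webapp2
from google.appengine.api import taskqueue

class ProcessHandler(webapp2.RequestHandler):
    def post(self):
        # ... process user input and generate output ...

        # Re-enqueue ourselves to run again in 4 seconds; each task is
        # still an ordinary request, subject to the usual deadlines.
        taskqueue.add(url='/tasks/process', countdown=4)

app = webapp2.WSGIApplication([('/tasks/process', ProcessHandler)])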
Reconsider what you're doing. As Ash Kim says, you can do it with the task queue, but first take a close look at whether you really need a process like this. Is it possible to rewrite things so the task runs only when needed, or immediately, or lazily (that is, only when the results are needed)?
