GAE mapper producing 'No quota, aborting' errors - google-app-engine

I am trying to set up a mapper job on Google App Engine using the mapper framework (Java version) here: http://code.google.com/p/appengine-mapreduce/
I'm kicking the job off with code like this:
Configuration conf = new Configuration(false);
conf.setClass("mapreduce.map.class", MyMapper.class, Mapper.class);
conf.setClass("mapreduce.inputformat.class", DatastoreInputFormat.class, InputFormat.class);
conf.set(DatastoreInputFormat.ENTITY_KIND_KEY, "Organization");
// Queue up the mapper request.
String configString = ConfigurationXmlUtil.convertConfigurationToXml(conf);
Queue queue = GaeQueueFactory.getQueue(QUEUE_NAME);
queue.add(
    Builder.url("/mapreduce/start")
        .param("configuration", configString));
I get the following error in the logs on both the dev server and prod server:
com.google.appengine.tools.mapreduce.MapReduceServlet processMapper: No quota. Aborting!
There is no additional stack trace. This appears about a dozen or so times each time I try to kick a job off.

I think you don't have enough quota to process your mapper job with the predefined default values.
Try lowering these configuration parameters:
mapreduce.mapper.inputprocessingrate
The aggregate number of entities processed per second by all mappers. Used to prevent large amounts of quota from being used up in a short time period. Default: 1000.
mapreduce.mapper.shardcount
The number of concurrent workers to use. This also determines the number of shards the input is split into. Default: 8.
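For example, a minimal sketch of lowering both values on the Configuration object from the question before it is serialized (the values 250 and 4 are just illustrative, lower-than-default picks):
// Throttle the job: lower the aggregate processing rate and the shard count.
// Tune the actual values to your app's quota.
conf.set("mapreduce.mapper.inputprocessingrate", "250");
conf.set("mapreduce.mapper.shardcount", "4");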

I have figured out the issue and am recording it here for anyone else having this problem. What confused me the most is that it was working several weeks ago and seems to have stopped sometime recently. Because the app isn't being used in a production environment, no one noticed.
What changed was that I was sending map requests to a custom 'mapper' task queue, not the default task queue. Because I have several mapper jobs, I set up a separate queue for all mapper jobs to use:
<queue>
  <name>mapper</name>
  <rate>5/s</rate>
  <bucket-size>10</bucket-size>
</queue>
When I switched the code back to using the default queue, everything worked as expected. I have filed a bug with the mapper team here: http://code.google.com/p/appengine-mapreduce/issues/detail?id=73
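For reference, a minimal sketch of the switch back to the default queue, assuming the standard task queue API (com.google.appengine.api.taskqueue; adjust to the older labs package and Builder.url(...) if that is what you are on):
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

// Enqueue the kick-off request on the default queue instead of the custom
// 'mapper' queue.
Queue queue = QueueFactory.getDefaultQueue();
queue.add(TaskOptions.Builder.withUrl("/mapreduce/start")
        .param("configuration", configString));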

Related

NDB query().iter() of 1000<n<1500 entities is wigging out

I have a script that, using Remote API, iterates through all entities for a few models. Let's say two models, called FooModel with about 200 entities, and BarModel with about 1200 entities. Each has 15 StringPropertys.
for model in [FooModel, BarModel]:
    print 'Downloading {}'.format(model.__name__)
    new_items_iter = model.query().iter()
    new_items = [i.to_dict() for i in new_items_iter]
    print new_items
When I run this in my console, it hangs for a while after printing 'Downloading BarModel'. It hangs until I hit ctrl+C, at which point it prints the downloaded list of items.
When this is run in a Jenkins job, there's no one to press ctrl+C, so it just runs continuously (last night it ran for 6 hours before something, presumably Jenkins, killed it). Datastore activity logs reveal that the datastore was taking 5.5 API calls per second for the entire 6 hours, racking up a few dollars in GAE usage charges in the meantime.
Why is this happening? What's with the weird behavior of ctrl+C? Why is the iterator not finishing?
This is a known issue currently being tracked on the Google App Engine public issue tracker under Issue 12908. The issue was forwarded to the engineering team and progress on this issue will be discussed on said thread. Should this be affecting you, please star the issue to receive updates.
In short, the issue appears to be with the remote_api script. When querying entities of a given kind, it will hang when fetching 1001 + batch_size entities when the batch_size is specified. This does not happen in production outside of the remote_api.
Possible workarounds
Using the remote_api
One could limit the number of entities fetched per script execution using the limit argument for queries. This may be somewhat tedious but the script could simply be executed repeatedly from another script to essentially have the same effect.
Using admin URLs
For repeated operations, it may be worthwhile to build a web UI accessible only to admins. This can be done with the help of the users module as shown here. This is not really practical for a one-time task but far more robust for regular maintenance tasks. As this does not use the remote_api at all, one would not encounter this bug.
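If you take the admin-UI route on the Python runtime, a complementary, config-only option (distinct from the users-module check linked above) is to restrict the handler to admins in app.yaml; the URL pattern and script name below are placeholders:
handlers:
- url: /admin/.*
  script: admin_ui.app
  login: admin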

How do I execute code on App Engine without using servlets?

My goal is to receive updates from some service (using HTTP request-response) all the time, and when I get specific information, to push it to the users. This code must run all the time (let's say every 5 seconds).
How can I run code that does this while the server is up (that is, not triggered by an incoming HTTP request)?
I'm using Java.
Thanks
You need to use
Scheduled Tasks With Cron for Java
You can set your own schedule (e.g. every minute), and it will call a specified handler for you.
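For example, a minimal cron.xml sketch; the /tasks/poll URL is just a placeholder for whatever handler does the polling:
<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
  <cron>
    <url>/tasks/poll</url>
    <description>Poll the external service for updates</description>
    <schedule>every 1 minutes</schedule>
  </cron>
</cronentries>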
You may also want to look at
App Engine Modules in Java
before you implement your code. You may separate your user-facing and backend code into different modules with different scaling options.
UPDATE:
A great comment from @tx802:
I think 1 minute is the highest frequency you can achieve on App Engine, but you can use cron to put 12 tasks on a Push Queue, either with delays of 5s, 10s, ... 55s using TaskOptions.countdownMillis() or with a processing rate of 1/5 sec.
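A rough sketch of that fan-out, assuming cron hits a handler containing the code below once a minute, and that /tasks/poll-worker is a hypothetical URL that does the actual work:
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

// Enqueue 12 tasks delayed 0s, 5s, ..., 55s so the worker URL is hit
// roughly every 5 seconds until the next cron run.
Queue queue = QueueFactory.getDefaultQueue();
for (int i = 0; i < 12; i++) {
    queue.add(TaskOptions.Builder.withUrl("/tasks/poll-worker")
            .countdownMillis(i * 5000L));
}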

Tasks targeted at dynamic backend fail frequently, silently

I had converted some tasks to run on a dynamic backend.
The tasks are failing silently [no logged error, no retry, nothing] ~20% of the time (min:10%, max:60%, sample:large, long term). Switching the task away from the backend restores retries and gets the failure rate back to ~0%.
Any ideas?
Converting it to a backend exacerbated the problem but wasn't the problem.
I had specified a task_retry_limit and the queue was a push queue. With a backend the number of instances is specified. (I believe you can replicate this issue on the frontend by ramping up requests rapidly, to a big number).
Tasks were failing 503: Instance Unavailable until they hit the task_retry_limit. This is visible temporarily in Task Queues, but will not show up in Logs.
I should be using pull queues. Even if my use case was stupid, I'd still +1 having a task that dies after multiple 503: Instance Unavailable responses log something, so it doesn't look like a phantom task.
Which runtime are you using on the backend?
Try running the backend for a bit without dynamic set to true and exercise the failing component.
On my project, I have seen tasks that target a static backend disappear on occasion, but nowhere near the rate you are seeing.

Getting "DeadlineExceededError" using GAE when doing many (~10K) DB updates

I am using Django 1.4 on GAE + Google Cloud SQL. My code works perfectly fine in dev (with a local sqlite3 DB for Django) but chokes with Server Error (500) when I try to "refresh" the DB. This involves parsing certain files, creating ~10K records, and saving them (I'm saving them in batch using commit_on_success).
Any advice?
This error is raised for frontend requests after 60 seconds (the limit has been increased from the older 30 seconds).
Solution options:
Use a task queue (a time limit of 10 minutes applies there, which is enough in practice).
Divide your task into smaller batches.
How we do it: we divide the work into smaller chunks on the client side and call them repeatedly.
Both solutions work fine; it depends on how you make these calls and whether you need the results, since a task queue doesn't return results to the client.
For tasks that take longer than 30s you should use a task queue.
Database operations can also time out when batch operations are too big; try using smaller batches.
Google App Engine has a maximum time allowed for a request. If a request takes longer than 30 seconds, this error is raised. If you have a large quantity of data to upload, either import it directly from the admin console, break the request up into smaller chunks, or use the command line (python manage.py dbshell) to upload the data from your computer.

App Engine "Copy to another app" jobs do not stop (after 3 weeks)

I set up a staging environment for a web app I created about 3 weeks ago, and I tried to transfer the data from the production environment that was already set up, using "Copy data to another app" in the Datastore Admin. The data was indeed copied to my staging environment. The problem is that the copy jobs are still running, 3 weeks after they were fired! (It took about 3 hours for the data to transfer to my staging environment.)
I tried to cancel the jobs using the abort option, with no luck.
As of now, 7 out of the 14 jobs are listed as completed, and the others are listed as active. My /_ah/mapreduce/controller_callback handler is being bombarded with 3.1 POSTs per second, and I think it has gotten to the point where it is harming my site's performance, not to mention costing me money...
How do I get the tasks to abort?
You can purge your task queues from the Task Queues section of the admin console. That will force the jobs to stop.
You can clean up the mapreduce jobs by deleting the entities they store in the datastore to keep track of their progress - they are called "mr_progress" or something like that.
I had a similar situation where DataStore Admin tasks would not die. The tasks were from a copy and / or backup operation. Aborting the jobs didn't do anything for me, either. Even after going into the task queue and purging the queue, the tasks would reappear. I did all of the following to keep the tasks dead (not sure if all steps are necessary!):
Paused the task queue (click Task Queues, and then the queue in question).
Purged the queue (clicking purge repeatedly)
Went into the datastore viewer and deleted all DatastoreAdmin entries with Active status, and MapReduce items. I'm not sure if this is necessary.
Changed the queue.yaml file to include a task_age_limit of 10s and task_retry_limit of 3 under the retry_parameters.
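A sketch of that queue.yaml change from the last step (the queue name here is a placeholder; the retry values are the ones mentioned above):
queue:
- name: default
  rate: 5/s
  retry_parameters:
    task_retry_limit: 3
    task_age_limit: 10s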
I originally was trying to do a large backup and copy to another app at the same time, during which the destination app ran into a quota. Not sure if this caused the issues. Riley Lark's answer got me started.
One other addition: when I went into the task queue and clicked the task name to expand the details, nothing ever loaded into the "Headers", "Raw Payload", etc. tabs. They just said "loading..."
