I have a little web shop hosted on AppEngine. It has ~100 products. Apart from orders (1/month) and registrations (1/month), I am not writing to the datastore.
I have a session included (http://gaeutilities.appspot.com/cookiesession) and some indexes for sorting.
Analytics tells me there were 1,315 page views in the last 30 days. That's ~44 per day, and they are supposedly causing 50,000 write operations (= ~1,136 per request)?
I can't really believe that. Any ideas how to debug this issue?
I would recommend taking a look at AppStats, which will give you a breakdown of all your RPC calls. It's really very handy for exactly this kind of scenario.
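If you're on the Python runtime (which the gaeutilities session suggests), enabling Appstats is usually just a small middleware hook in appengine_config.py, roughly like this:

from google.appengine.ext.appstats import recording

def webapp_add_wsgi_middleware(app):
    # Wrap every WSGI request so Appstats records all the RPC calls it makes.
    return recording.appstats_wsgi_middleware(app)

You can then browse the per-request RPC breakdown at /_ah/stats (you may also need to enable the appstats builtin or handler in app.yaml, depending on your SDK version).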
The logs helped (surprise, surprise!). It always fails when the session is used:
[..]
File "/base/data/home/apps/[..]/12.376983336014703566/appengine_utilities/sessions.py", line 87, in put
db.put(self)
[..]
OverQuotaError: The API call datastore_v3.Put() required more quota than is available.
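The put on sessions.py line 87 means the session library writes the session entity back to the datastore on every request, which is what burns the write quota. One common mitigation is to persist the session only when its data actually changed; a rough sketch (these attribute names are illustrative, not the gaeutilities API):

from google.appengine.api import memcache

def save_session(session):
    # Keep the session fresh in memcache on every request...
    memcache.set("session:" + session.session_id, session.data, time=3600)
    # ...but only pay for a datastore write when something actually changed.
    if session.dirty:
        session.put()
        session.dirty = False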
I am operating in a production environment with a number of different applications using the Amazon API. Of these, some are our own home-grown apps, and others are 3rd party shipping applications.
I have a situation where I am hitting an hourly throttle for the Reports API 'GetReport' request, and I am trying to determine what is causing us to be throttled. By my count, we shouldn't be exceeding ~60 calls per hour at the absolute maximum. (Just a note: while the API documentation says this call is throttled at 60 requests per hour, the exception I received back indicated a cap of 120 requests per hour. Maybe the exception is wrong and I'm hitting a 60-request cap?)
Is there either an API call to determine current call usage, or a way to access this information via Amazon Seller Central / the Developers Program? I've done some searching around, but everything I can find describes how the throttling works, which isn't my problem.
I am currently using C# Amazon MWS libraries for all function calls, although that information is a bit superfluous. Any insight into the proper API call to use, or how to gain access to this information would be greatly appreciated.
In the response to most calls, you get back headers like the following:
"x-mws-quota-max"=>"60.0",
"x-mws-quota-remaining"=>"51.0",
"x-mws-quota-resetsOn"=>"2016-03-25T16:00:00.000Z"
You should be able to use these to figure out what is causing you to hit the limit more quickly than expected. Perhaps log each call together with the quota data above?
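As a rough sketch (Python here purely for illustration, and assuming your MWS client lets you at the raw response headers; the helper name is made up):

import logging

def log_mws_quota(operation, headers):
    # Record the throttling headers MWS returns on most calls, so you can see
    # which operation is burning through the hourly quota.
    logging.info(
        "MWS %s: quota max=%s, remaining=%s, resets on=%s",
        operation,
        headers.get("x-mws-quota-max"),
        headers.get("x-mws-quota-remaining"),
        headers.get("x-mws-quota-resetsOn"),
    )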
Contact MWS Support here and ask for clarification on your issue. They must track your usage in order to cap it. I met with the MWS team a few months ago in Detroit and they said to ask them any time you have a technical question. They've been really helpful to me.
I'm maintaining a blog app (blog.wokanxing.info, it's in Chinese) for myself that is built on Google App Engine. It's been two or three years since the first deployment, and I've never hit any quota issue because of its simplicity and low traffic.
However, since early last month I've noticed that from time to time the app reports a 500 server error, and the admin panel shows a mysteriously fast consumption of the free datastore read operation quota. Within a single hour about 10% of the free read quota (~5k ops) is consumed, yet I count only a dozen requests that involve datastore read ops, 30 tops. That works out to an average of 150 to 200 read ops per request, which sounds impossible to me.
I haven't committed any change to my codebase for months, and I'm not seeing any change in the datastore or quota policy either. It also puzzles me how that much consumption is even possible. I use memcache a lot, which leaves the front page as the biggest consumer: it fetches the latest posts using Post.all().order('-date').fetch(10, offset). Other requests merely fetch a single model using Post.get_by_key_name and iterate over post.comment_set.
Sorry for my poor English, but can anyone give me some clues? Thanks.
From the Admin console, check your logs. Don't look only for errors; check all types of messages in the log.
Look for requests made by robots/web crawlers. In most cases, you can detect such "users" by the words "robot" or "bot" in the user-agent (well, if they are honest...).
The first thing you can do is edit your robots.txt file. For more detail, read How to identify web-crawler?. GAE's documentation also covers how to serve a robots.txt file.
If that fails, try to identify the IP addresses used by the bot(s). Using the GAE Admin console, put those addresses on a blacklist and check your quota consumption again.
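If you would rather not block by IP, you can also short-circuit obvious crawlers in the handler itself before doing any datastore work. A minimal sketch (webapp2-style; the keyword list and handler name are assumptions):

import webapp2

BOT_KEYWORDS = ("bot", "crawler", "spider", "slurp")

class FrontPageHandler(webapp2.RequestHandler):
    def get(self):
        user_agent = (self.request.headers.get("User-Agent") or "").lower()
        if any(word in user_agent for word in BOT_KEYWORDS):
            # Serve a cheap static response instead of running datastore queries.
            self.response.write("OK")
            return
        # ... normal rendering with memcache / datastore reads ...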
I developed an application for a client that uses Play framework 1.x and runs on GAE. The app works great, but sometimes it is crazy slow. It can take around 30 seconds to load a simple page, while at other times it runs fast, with no code change whatsoever.
Is there any way to identify why it's running slow? I tried to contact support but couldn't find any telephone number or email address, and there has been no response on the official Google group.
How would you approach this problem? My customer is currently very angry about the slow loading times, but switching to another provider is a last resort at the moment.
Use GAE Appstats to profile your remote procedure calls. All of the RPCs are slow (Google Cloud Storage, Google Cloud SQL, ...), so if you can reduce the number of RPCs or use some caching data structures, do so and your application will be much faster. Appstats also shows you which parts are slow and whether they need attention :).
For example, I've created a Google Cloud Storage cache for my application and cut execution time from 2 minutes to under 30 seconds. RPCs are the bottleneck on GAE.
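The caching does not have to be anything elaborate; a read-through memcache wrapper is often enough to eliminate most of the RPCs (sketched in Python for brevity, although the same pattern applies on the Java runtime; the key naming is an assumption):

from google.appengine.api import memcache

def cached(key, ttl_seconds, compute):
    # Return the memcached value if present, otherwise run the expensive,
    # RPC-heavy computation and cache its result for ttl_seconds.
    value = memcache.get(key)
    if value is None:
        value = compute()
        memcache.set(key, value, time=ttl_seconds)
    return value

# Example: front_page = cached("front_page_html", 300, render_front_page)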
Google does not usually provide contact support for many of its services. The App Engine slowness described here is probably caused by a cold start: front-end instances are shut down after roughly 15 minutes of inactivity. You could write a cron job that pings your instances every 14 minutes to keep them warm.
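A keep-alive can be as small as the following (sketched for the Python runtime; a Play/Java app would use cron.xml and a controller instead, and the /ping URL is an assumption):

import webapp2

class PingHandler(webapp2.RequestHandler):
    def get(self):
        # Do no real work; this request only exists to keep an instance warm.
        # Schedule it from cron, e.g. "every 14 minutes" in cron.yaml / cron.xml.
        self.response.write("pong")

app = webapp2.WSGIApplication([("/ping", PingHandler)])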
Combining some answers and adding a few things to check:
Debug using Appstats. Look for "staircase" situations and RPC calls. Maybe something in your app is triggering RPC calls at certain points that don't happen in your logic all the time.
Tweak your instance settings. Add some permanent/resident instances and see if that makes a difference. If you are spinning up new instances, things will be slow, for probably around the time frame (30 seconds or more) you describe. It will seem random. It's not just how many instances, but what combinations of the sliders you are using (you can actually hurt yourself with too little/many).
Look at your app itself. Are you doing lots of memory allocations in the JVM? Allocating and freeing memory is inherently a slow operation and can cause freezes. Are you sure your freezing is not a JVM issue? Try replicating the problem locally and tweak the JVM -Xmx and -Xms settings to see if you find similar behavior. Also profile your application locally for memory/performance issues. You can cut down on allocations using pooling, DI containers, etc.
Are you running any sort of cron jobs or processing on your front-end servers? Try to move as much as you can (sending emails, for example) to background tasks; see the task-queue sketch after this answer. The intervals may seem random, but they can be the result of work kicking in depending on your job settings. "9 am every day" may not mean what you think, depending on the cron/task options. A corollary: move things to backend servers and pull queues.
It's tough to give you a good answer without more information. The best someone here can do is give you a starting point, which pretty much every answer here already has.
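For the background-task point above, the deferred library makes it cheap to push work such as sending an email out of the user-facing request (Python shown for brevity; send_order_email is a placeholder for your own function):

from google.appengine.ext import deferred

def send_order_email(order_id):
    # ... build and send the email outside the front-end request ...
    pass

# Inside a request handler, enqueue the work instead of doing it inline:
deferred.defer(send_order_email, 42)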
By making at least one instance resident, you get a big improvement on first use. It takes about 15 seconds to load the application into a fresh instance, which is why you experience long request times when nobody has used the application for a while.
I was in a place without Internet access for 3 weeks and just came back to find that, since January 18, one of my apps has been hitting a quota limit (Datastore Read Operations) after around 18 hours.
I don't see any increase in traffic from either users or crawlers.
This is the error in the logs:
"The API call datastore_v3.RunQuery() required more quota than is available."
It seems very strange, since this application has been running for some years and I'm memcaching most of the datastore requests.
Please help - This is affecting my bottom line!
Thanks.
I found a subset of pages on the site that had attracted sudden interest from several crawlers, and some of the Datastore requests those pages made were not being memcached, so that was it... problem solved.
Thanks.
I'm having some trouble with the google app engine datastore. Ever since the new pricing model was introduced, the cost of running my app has increased massively.
The culprit appears to be "Datastore small operations", which come in at more than 20 Million ops per day!
Has anyone had this problem? I don't think I'm doing an excessive number of key lookups, and I only have 5,000 users, with roughly 10-20 requests per minute.
Thanks in advance!
Edit
OK, I got some stats; these are after about 3 hours. Here is what I am seeing in my dashboard, in the billing section:
And here are some of the stats:
Obviously there are quite a lot of calls to datastore.get. I am starting to think that it is my design that is causing the problem. Those gets correspond to accounts. Every user has an account, but an account can be one of two types; for this I use composition, so each account entity has a link to its sub-account entity.
As a result, when I search for nearby users, I fetch the accounts with a query and then do a get on each account to load its sub-account. The top request in the stats picture is a call that queries 100 accounts and then has to do a get on each one. I would have thought this was a very light query, but apparently not. And I am still confused by the number of datastore small ops being recorded in my dashboard.
Definitely use appstats as Drew suggests; regardless of what library you're using, it will tell you what operations your handlers are doing. The most likely culprits are keys-only queries and count operations.
My advice would be to use AppStats (Python / Java) to profile your traffic and figure out which handler is generating the most datastore ops. If you post the code here we can potentially suggest optimizations.
Don't scan your datastore, use get(key) or get_by_id(id) or get_by_key_name(keyname) as much as you can.
Do you have lots of ReferenceProperty properties in your models? Accessing one triggers a db.get for each entity unless you prefetch them. The code below, for example, would trigger 101 datastore requests: one query plus a db.get for each of the 100 fetched entities.
class Foo(db.Model):
    user = db.ReferenceProperty(User)

foos = Foo.all().fetch(100)
for f in foos:
    print f.user.name  # dereferencing f.user triggers a separate db.get() for each Foo
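A common mitigation (a widely used helper pattern, not part of the SDK) is to resolve all the referenced keys with a single batch db.get instead of dereferencing one entity at a time. This sketch builds on the Foo/User models above and assumes every Foo has its user property set:

from google.appengine.ext import db

def prefetch_refprops(entities, *props):
    # Collect the raw keys stored in each ReferenceProperty without dereferencing.
    fields = [(entity, prop) for entity in entities for prop in props]
    ref_keys = [prop.get_value_for_datastore(entity) for entity, prop in fields]
    # One batch get for all referenced entities instead of one get per entity.
    ref_entities = dict((e.key(), e) for e in db.get(list(set(ref_keys))))
    for (entity, prop), ref_key in zip(fields, ref_keys):
        prop.__set__(entity, ref_entities[ref_key])
    return entities

foos = prefetch_refprops(Foo.all().fetch(100), Foo.user)
for f in foos:
    print f.user.name  # no per-entity db.get calls now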