Need Clarification of App Engine SearchAPI Quota - google-app-engine

I'm experimenting with the Search API with appengine, and am consistently running into the "short term burst quota" described in this SO post: quotas on appengine search api for java.
In our use case, we need to delete all documents from an index and then repopulate it. We've approached this by:
Looping through the list of documents and deleting them
Adding documents through a task queue at a throughput rate of 1/s
I'm still hitting this burst limit, and I'm wondering whether I need to add a sleep when I delete the documents.
This burst rate is severely limiting us (since we build these indexes on the fly based on other criteria), and I'm curious whether anyone has more insight.

You should limit the usage using a queue and not sleep (never ever use sleep in AppEngine).
You can request an increase on your quota.

I believe that there is a 100 calls per minute quota as well as 20k per day (although the per minute quota may now have been removed).
The best way around this that I have found is with cursors and taskqueues to spread the load.
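As a rough illustration of spreading the load, the sketch below splits a list of document IDs into batches and assigns each batch a delay so that the call rate stays under a per-minute limit. It is plain Python and does not call the Search API; `batches`, `plan_deletes`, the batch size of 200 and the default of 100 calls per minute are assumptions for illustration (the last taken from the figure quoted above), not confirmed limits.

```python
from itertools import islice


def batches(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def plan_deletes(doc_ids, batch_size=200, calls_per_minute=100):
    """Return (delay_seconds, batch) pairs spaced to stay under the rate.

    batch_size and calls_per_minute are assumed values, not confirmed quotas.
    """
    spacing = 60.0 / calls_per_minute  # seconds between consecutive calls
    return [(i * spacing, batch)
            for i, batch in enumerate(batches(doc_ids, batch_size))]


plan = plan_deletes(range(450))
for delay, batch in plan:
    print(delay, len(batch))
```

Each pair could then be handed to a task queue task, with the delay used as the task's countdown, so that no request handler ever needs to sleep.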


Further explanations on YouTube API quota: eg. does `search?maxResults` add up?

So, I'm a bit confused about how YouTube API queries affect quota. My questions all concern how the cost of a request is "composed":
The Quota calculator shows the cost of various resources.
For example search.list = 100 and videos.list = 1.
One aspect of it is not clear to me, though: how is the cost calculated for a single request that returns multiple results?
| query | quota doubt |
| --- | --- |
| /search?maxResults=10 | is it one 100 quota, or ten 100 (1000) quotas? |
| /videos?id=A,B | is it one quota, or two quotas? |
| /video?part=A,B | is this adding two quotas? (each video??) Since no ?part= returns only id related data |
| /...?fields=A,B(C) | is fields query impacting the request quota anyhow? |
I first thought it was really straightforward: 1 call, 1 quota "package". And that seemed to be supported by this calculator's quote:
If your application calls a method, such as search.list, that returns multiple pages of results, each request to retrieve an additional page of results incurs the estimated quota cost.
But while developing a simple video list, my daily quota blew up pretty damn fast. So not sure anymore.
Every time you call the method in question, you incur the quota cost.
For example:
search.list 100
When you call search.list it costs 100; if you call it again to get the next page of results, it costs you another 100 points.
Now take this one, where you are trying to get back two videos:
/videos?id=A,B
The same is true here: it is a single request to the server, so the quota cost is one.
Fields does not affect it; the cost depends only on the request you make. Batching will not save you quota either: if you batch these requests, you are charged for each of the requests within the batch.
Intro to YouTube API and cost based quota for beginners
A lot of this information is on the Quota cost page.
The table below shows the quota cost for calling each API method. All API requests, including invalid requests, incur a quota cost of at least one point.
If your application calls a method, such as search.list, that returns multiple pages of results, each request to retrieve an additional page of results incurs the estimated quota cost.
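The per-request rule above can be turned into a small estimate. This is only a sketch: the two costs are the ones quoted in the question, and `paged_cost` is a hypothetical helper, not part of any YouTube client library.

```python
import math

# Per-call costs as quoted in the question (from the quota calculator).
COST = {"search.list": 100, "videos.list": 1}


def paged_cost(method, total_results, max_results):
    """Estimate quota points spent fetching total_results items, one page per call."""
    pages = max(1, math.ceil(total_results / max_results))  # at least one request
    return pages * COST[method]


print(paged_cost("search.list", 10, 10))  # one request
print(paged_cost("search.list", 50, 10))  # five requests, one per page
print(paged_cost("videos.list", 2, 50))   # two ids in a single request
```

Under this model the number of requests drives the cost, not the number of items returned, so paging through search results in small pages uses up quota much faster than one larger page.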

Dynamodb ondemand cost and scaling during hot partition(adaptive scaling)

I understand the cost of provisioned tables, but I have a few questions regarding on-demand mode.
Does on-demand pricing consider only the sum of the WRUs used by each partition, or the overall WRU for the table based on the usage pattern, which is shared across the partitions?
When there is a hot partition, does on-demand increase the WRU only for that partition, or does it increase the overall WRU of the table?
Does adaptive capacity work with on-demand tables?
Example:
An on-demand table with 10 partitions and a current peak of 1000 WRU.
If 2 hot partitions require more than 300 WRU, will it draw on adaptive capacity, or increase the overall WRU to 3000, resulting in a high cost?
I'm not a DynamoDB insider, so I can only answer from what I understand from their documentation.
In on-demand pricing (see https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html#HowItWorks.OnDemand) you pay exactly by the number and size of your requests. If you make one million requests, you will pay the same whether these requests were to a million different partitions, or they all went to the same partition.
You might wonder, then, why load imbalance was such an issue under provisioned-capacity pricing, or at least why the Web is full of stories about it. There should never have been such an issue, but in the past there was. It has since been resolved. Here is the story:
In the provisioned pricing page, Amazon claims that if you reserve 1000 WCU, you should be able to use this number of write units that you paid for, per second, and if you try to use more, you'll be throttled. There is no mention or warning of imbalanced loads or hot partitions... But people discovered that this wasn't quite true - Amazon had a bug in their throttling code: the usage counting wasn't done across the entire cluster. Instead, if your data was spread over 10 nodes, your reservation of 1000 was evenly split among them, so each of the 10 nodes would start to throttle you after 100 (1000/10) requests per second. This split worked well for well-balanced loads, but with hot partitions, it didn't work well. People were paying for a reservation of 1000 but when they measured how much they were getting, they saw throttling after just 800 (for example) requests per second. Amazon acknowledged that this was a bug, and fixed it with their "adaptive capacity" technique, where each of the nodes picks a different throttling limit, modified until the user's total usage approaches what they had paid for. This technique is explained in this excellent official talk
https://www.youtube.com/watch?v=yvBR71D0nAQ - see time 19:38. Until very recently this "adaptive capacity" was a very blunt instrument, which only worked well if your workload didn't change quickly, but this issue has since been fixed too, as described in
https://aws.amazon.com/blogs/database/how-amazon-dynamodb-adaptive-capacity-accommodates-uneven-data-access-patterns-or-why-what-you-know-about-dynamodb-might-be-outdated/
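The even-split behaviour described in the answer above can be illustrated with a toy model. This is only a sketch of the arithmetic, not DynamoDB's actual throttling code; both function names and the demand figures (modelled on the question's example of 10 partitions and 1000 write units) are assumptions.

```python
def admitted_even_split(demand, table_limit):
    """Old behaviour as described: each partition is capped at table_limit / n."""
    per_partition = table_limit / len(demand)
    return sum(min(d, per_partition) for d in demand)


def admitted_table_wide(demand, table_limit):
    """Idealised accounting against the table total, ignoring partitions."""
    return min(sum(demand), table_limit)


# Two hot partitions and eight quiet ones, 1000 writes per second in total.
demand = [300, 300] + [50] * 8

print(admitted_even_split(demand, 1000))  # hot partitions are capped at 100 each
print(admitted_table_wide(demand, 1000))  # the whole reservation is usable
```

In this model the even split admits only 600 of the 1000 requested writes per second, even though the reservation is 1000, which is the kind of gap people reported before adaptive capacity.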

Throttling limits for Azure Search

I'm looking for throttling information and this is the best that I've been able to find so far: https://learn.microsoft.com/en-us/azure/search/search-limits-quotas-capacity#throttling-limits
For doing a search
https://{{search-service}}.search.windows.net/indexes/:index/docs?api-version={{version}}&search=some text
Is this line from the reference page above the limit for searches?
Get Index (GET /indexes/myindex): 10 per second per search unit
I'm trying to see what the limit is for searching only under ideal scenario of nothing else happening such as an indexer running.
Some APIs such as GET /indexes are throttled based on simple rate limits. However, queries and indexing requests do not work this way. In the case of those APIs, throttling happens dynamically based on resource availability. If the system's internal queues start to fill, requests will begin to fail with 503 (Service Unavailable). If enough such failures happen within a discrete period of time (calculated as an average over a rolling window), the service will throttle requests in order to relieve pressure and allow the system to recover.
The reason throttling works this way instead of based on static rate limits is that most Azure Cognitive Search pricing tiers (other than Free) give you dedicated capacity. Static rate limits could artificially limit how you use your own capacity, so instead throttling dynamically applies backpressure as a way to ensure the reliability of the service when its capacity is overloaded.
For more information about testing and performance tuning Azure Cognitive Search, see this article.
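Since throttled or overloaded requests fail with 503, a client would typically retry with a growing delay instead of at a fixed rate. The sketch below is one way to do that, under stated assumptions: `send` is a hypothetical callable that performs the query and returns the HTTP status code, and the attempt count and delays are arbitrary illustrative values, not figures recommended by the service.

```python
import time


def call_with_backoff(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-503 status or attempts run out.

    The delay doubles after each 503: base_delay, 2 * base_delay, and so on.
    """
    status = None
    for attempt in range(max_attempts):
        status = send()
        if status != 503:
            return status
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # back off before the next try
    return status


# Simulated service that is overloaded twice and then recovers.
responses = iter([503, 503, 200])
print(call_with_backoff(lambda: next(responses), sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; in production the default `time.sleep` would be used.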
For Azure Search, there are two kinds of APIs: Query APIs (Search/Suggest/Autocomplete) and Index APIs.
The one you mentioned belongs to Index APIs:
Get Index (GET /indexes/myindex): 10 per second per search unit
If you want to know the limit for Query APIs (searching), i.e. the QPS limit, this doc will be helpful:

any number on Google App Engine free quota in terms of total number of request and unique visitors

Does anyone have any numbers on the Google App Engine free quota, in terms of the total number of requests and unique visitors it allows per day?
Maybe someone who has live production code can tell us this?
A rough number is enough, just to get the idea.
I cannot get this information from the pricing model.
Thanks
I had this question when I first started using App Engine, but it's impossible to answer with the information in your question.
You must have an estimate on the individual API quota usages, then calculate based on that.
You might be able to simplify it by trying to figure out which API quota you're likely to hit first, and then figuring out the number of requests you can serve before that quota runs out. For example:
Storing photos or other large data for users? You'll probably hit the blobstore quota first. Daily/unique visitor counts probably won't matter.
Serving lots of photos or large data? You'll probably hit the bandwidth quota first.
Need to start a channel for every view? You'll probably hit the channel quota first and get 100 views per day.
Need to send an email for every view? You'll probably hit the mail quota first.
Need to query the datastore a lot? You'll probably hit the datastore limit first.
The datastore limit is the hardest to calculate. You get 50k read and 50k write ops. Most likely you'd read more than write.
If you need 2 read ops per page, you could serve about 25k views per day.
If you need 2 read ops per page, but you're smart and you memcache them, and memcache is effective 80% of the time, you could get 125k views per day.
If you need 500 read ops per page and you can't cache it, you can do 100 views per day. That's provided you don't run out of one of the other quotas.
Do your own math.
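The datastore arithmetic above can be written down as a small calculation. This is just a sketch of that estimate: the 50k read-op figure is the one quoted in this answer, and `views_per_day` is a hypothetical helper that ignores every other quota.

```python
def views_per_day(read_quota, reads_per_page, cache_hit_percent=0):
    """Estimate page views per day before the datastore read quota runs out.

    cache_hit_percent is the share of reads served from memcache (0 to 99).
    """
    if not 0 <= cache_hit_percent < 100:
        raise ValueError("cache_hit_percent must be between 0 and 99")
    # Only cache misses consume datastore read ops.
    return read_quota * 100 // (reads_per_page * (100 - cache_hit_percent))


print(views_per_day(50_000, 2))                        # no caching
print(views_per_day(50_000, 2, cache_hit_percent=80))  # memcache hits 80% of the time
print(views_per_day(50_000, 500))                      # heavy, uncacheable pages
```

The three calls reproduce the 25k, 125k and 100 views per day from the examples above.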
The quotas and rates (for free and paid apps) are listed on https://developers.google.com/appengine/docs/quotas.

how fast is Google App Engine MapReduce?

How much of a compute-intensive gain can one expect on GAE MapReduce? The scenario of interest to me is compute intensive, so for example: multiplying a trillion random floats in a single threaded single core application. Then imagine 1000 MapReduce workers multiplying a billion random numbers each and announcing "finished" when all workers have finished. Assume billing is enabled if that matters. (It might not).
Edit: A commenter asked for clarification. The title has been revised. If the task takes 50000 seconds single threaded and in an alternative implementation 1000 MapReduce workers are employed and they finish after 500 seconds, then the performance gain is 100 times. 1000 workers: 100 times gain, only slightly disappointing, but so be it for this example. How can I get finished sooner? Can I ask for 10,000 workers? This question may have to do with limits and quotas. Assume an adequate budget. Does MapReduce's compute-intensive performance gain head to an asymptote and if so what is the performance gain at that asymptote?
There was also information in the comment about MapReduce being suitable for large amounts of data generated by a user facing URL however, my question is not in regard to a Datastore-intensive application's performance versus the same application rewritten for MapReduce. Datastore activity will be minimal in this compute-intensive scenario. I realize there will always be some Datastore activity in any MapReduce application, but since this is a compute-intensive scenario, the Datastore activity and the size of the Datastore entities is not going to be a big influence on the performance gain calculated. The task will use the Datastore for less than 1% of the elapsed time. Nor is the scenario involving a large amount of communication bandwidth (other than the minimum necessary to hit the task queued URLs that MapReduce uses).
The question is in regard to comparing a compute-intensive single threaded non-MapReduce task's elapsed time to the same task's elapsed time on MapReduce which is inherently multi-threaded given there are multiple workers. I use the word "task" generically, in other words, "task means work". The gain might (but not necessarily) be a function of the number of workers hence I mentioned 1000 workers in the example.
It's not clear exactly what you're asking here. Are you asking how efficient it is? How cheap it is? How fast it is?
In general, App Engine is designed for serving user-facing sites, and the App Engine mapreduce API exists to assist with that - processing large amounts of data generated by the user-facing site. If you have a large amount of data that's hosted outside App Engine, and you want to do some sort of large-scale data processing on it, App Engine is probably not the tool for you.
Regarding performance, you can expect each worker to execute tasks as fast as they would be if you were executing them serially, so your items-per-second is roughly the number of workers multiplied by the regular rate - there's relatively little overhead. There can be some delay at the end, though, when different workers finish at different times, and how much this is depends on how good a job mapreduce does of sharding your data. With datastore input, this used to be fairly poor, but it's a lot better now.
As to how many mappers you can have, that depends on a number of things: Whether or not your app has billing enabled, how much other traffic your app gets, and how long your mapper tasks take per element. The only real way to determine this is to experiment a bit.
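The point about uneven shards can be made concrete with a simple model in which the job ends when the slowest worker ends. This is only a sketch under that assumption; `speedup` and `efficiency` are hypothetical helpers, and the timings reuse the 50000-second example from the question.

```python
def speedup(serial_seconds, shard_seconds):
    """Serial time divided by the finishing time of the slowest shard."""
    return serial_seconds / max(shard_seconds)


def efficiency(serial_seconds, shard_seconds):
    """Speedup per worker; 1.0 means perfectly even sharding with no overhead."""
    return speedup(serial_seconds, shard_seconds) / len(shard_seconds)


serial = 50_000
even = [50] * 1000             # work split evenly across 1000 workers
skewed = [50] * 999 + [500]    # one shard received ten times the work

print(speedup(serial, even), efficiency(serial, even))
print(speedup(serial, skewed), efficiency(serial, skewed))
```

In this model a single oversized shard is enough to turn a 1000 times gain into the 100 times gain from the question, which is why the quality of the sharding matters more than the raw number of workers.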
