Search API on Google App Engine - google-app-engine

I am running a proof of concept using App Engine and the built-in Search API. We are testing the Search API under the assumption that it provides linear scaling, as is the case with other products and services bundled with App Engine.
Specs: approx. 8 million documents in a single index.
Query type: complex queries. We need spatial queries based on square areas (bounding boxes), not distance. All queries include two ranges, on latitude and longitude.
Page sizes: between 16 and 250.
Accuracy (result counting): set to 100 in all test cases.
Our target performance (latency) is in the hundreds of milliseconds.
We are testing the performance of the Search API while running several concurrent requests. Test results are currently measured at about 25 concurrent requests, but this number is expected to go up significantly. However, if the Search API scales properly, that should not matter.
I am measuring the time it takes the Search API to process a call to Index.search(Query).
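For context, here is a minimal sketch of the kind of call being timed, using the Python variant of the Search API (the Java Index.search(Query) call is equivalent); the index name and field names below are placeholders for our schema:

    import time

    from google.appengine.api import search

    def timed_bbox_search(lat_min, lat_max, lng_min, lng_max, page_size=250):
        # 'places', 'lat' and 'lng' are placeholder names for our index schema.
        index = search.Index(name='places')
        query_string = ('lat >= %f AND lat <= %f AND lng >= %f AND lng <= %f'
                        % (lat_min, lat_max, lng_min, lng_max))
        options = search.QueryOptions(
            limit=page_size,            # page sizes between 16 and 250
            number_found_accuracy=100)  # accuracy set to 100 in all tests
        start = time.time()
        results = index.search(search.Query(query_string, options=options))
        elapsed_ms = (time.time() - start) * 1000
        return results, elapsed_ms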
What I am observing is the following:
The average time it takes the search method to return is around 8000 ms. There are no cases in which the method returns significantly faster or slower than that. However, using an index with only 10 documents results in latencies of around 300 ms (!!!). This might be an indication that the Search API is not scalable at all.
The page size does not seem to make any significant difference. Perhaps at page sizes of 10,000 or higher it would, but that is not part of our tests.
Adding one criterion (an equality filter) speeds up the search significantly, by up to approximately 40%. This is a nice improvement, but 4 seconds is still an eternity.
Questions:
What is the expected latency (best possible scenario/configuration) that the Search API can deliver?
Which parameters influence latency, including App Engine configuration?
Does the number of documents in an index influence latency?
Is a search based on 2 range queries slower than a search based on equality filters alone? (We could pre-process the data and add 'index' data to each document; see the sketch after this list.)
Is the Search API really scalable?
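To illustrate the pre-processing idea behind question 4: at indexing time each document would get a precomputed grid-cell ID, so a bounding box could be expressed as one or more equality filters instead of two numeric ranges. A rough, untested sketch with an arbitrary 0.1-degree grid (field and index names are placeholders):

    import math

    from google.appengine.api import search

    CELL_SIZE = 0.1  # degrees; arbitrary choice for illustration

    def cell_id(lat, lng):
        # Map a coordinate onto a coarse grid cell, e.g. '524_40'.
        return '%d_%d' % (math.floor(lat / CELL_SIZE), math.floor(lng / CELL_SIZE))

    def build_document(doc_id, lat, lng):
        return search.Document(doc_id=doc_id, fields=[
            search.NumberField(name='lat', value=lat),
            search.NumberField(name='lng', value=lng),
            # Extra equality field: a tile query can then be written as
            # 'cell:"524_40" OR cell:"524_41"' instead of two range filters.
            search.AtomField(name='cell', value=cell_id(lat, lng)),
        ])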

Our application for this was to plot a number of markers on a map using a tile server. However, the tile server performs many queries (one per tile) in parallel, almost 30 per user/view. To make things difficult, we were not able to solve this problem using pre-aggregated maps because we have too many parameters/dimensions to take care of (if this is the case for you, try Google Maps Engine).
So we ended up with a CloudSQL instance set to the highest tier for maximum performance. Another reason to use a relational database is that index performance can be tuned more precisely than with the Search API or BigQuery.
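For completeness, the per-tile query we run against CloudSQL looks roughly like the sketch below (table and column names are illustrative); a composite index on (lat, lng) is the part we tune for the bounding-box lookups:

    import MySQLdb  # the MySQL driver used for Cloud SQL connections

    # Illustrative schema:
    #   CREATE TABLE markers (id BIGINT PRIMARY KEY, lat DOUBLE, lng DOUBLE, ...);
    #   CREATE INDEX idx_lat_lng ON markers (lat, lng);

    def markers_in_tile(conn, lat_min, lat_max, lng_min, lng_max, limit=250):
        cursor = conn.cursor()
        cursor.execute(
            'SELECT id, lat, lng FROM markers '
            'WHERE lat BETWEEN %s AND %s AND lng BETWEEN %s AND %s '
            'LIMIT %s',
            (lat_min, lat_max, lng_min, lng_max, limit))
        return cursor.fetchall()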
To answer the questions, this is what we found:
The latency depends on the size of the index. At lower volumes per index the latency seems reasonable; at much higher volumes it may become a problem. For text search this is probably OK in most cases.
We did not test at lower volumes, but at around 8 million documents the latency sits between 5000 and 8000 ms per query. We did not find any parameters that decreased latency; we did find parameters that increased it.
Yes.
We did not test this.
Yes.

Related

Azure Search paging causes throttling

I am from the team that runs nuget.org, the package ecosystem for .NET. We use Azure Search to power our search API. Our APIs are public, so third-party customers can use them to analyze our ecosystem or make apps.
We recently had an outage caused by a single customer paging through our search documents using the $skip and $top query parameters in batches of 200 documents at a time. This resulted in Azure Search throttling:
Failed to execute request because the request rate has caused your service to exceed the limits of its provisioned capacity. Reduce the rate of requests, or adjust the number of replicas/partitions. See http://aka.ms/azure-search-throttling for more information.
Azure Search's throttling affected all customers in that region for 10 minutes, not just the single customer that was paging. We read through Azure Search's throttling documentation, but have the following questions:
Is customer paging with high $skip values particularly expensive for Azure Search?
What can we do to reduce the likelihood of Azure Search throttling for paging scenarios?
Should we add our own throttling to ensure a single customer's searches don't affect all other customers' searches? Does Azure Search have guidance on this?
Some more information about our service:
Number of documents in index: ~950K
Request volume: 1.3K paging requests in ~10 minutes. Peak of 125 requests per second, average of 6 requests per second
Scale: standard SKU, 1 partition, 3 replicas (this is our secondary region, hence the smaller scale to save money)
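For reference, the paging pattern described above (a client walking the whole index with $top/$skip) looks roughly like this against the REST API; the service name, index name, API version and key are placeholders:

    import requests

    SERVICE = 'my-search-service'   # placeholder
    INDEX = 'packages'              # placeholder
    API_VERSION = '2020-06-30'      # placeholder
    API_KEY = '<query-key>'         # placeholder

    def page_through_results(query, page_size=200):
        url = 'https://%s.search.windows.net/indexes/%s/docs' % (SERVICE, INDEX)
        skip = 0
        while True:
            params = {
                'api-version': API_VERSION,
                'search': query,
                '$top': page_size,
                '$skip': skip,  # grows without bound; this is the expensive part
            }
            resp = requests.get(url, params=params, headers={'api-key': API_KEY})
            resp.raise_for_status()
            docs = resp.json().get('value', [])
            if not docs:
                break
            for doc in docs:
                yield doc
            skip += page_size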
Deep paging is indeed a costly operation. Since Azure Search is designed to be distributed, all indexes are divided into multiple shards to allow for quick scale operations. This comes with the downside that ranked results from each shard need to be merged and ranked to create the final list of results. The number of results to merge increases linearly with the skip value, so that step can become expensive when paging very deep into the results.
As a search service, Azure Search is optimized for quick retrieval of top documents based on textual relevance. It's unfortunately not the best tool for scenarios where a client simply wants to return a list of all documents in a data source.
From what I understand from your post, there are two reasons for the throttling:
High skip values
Sharp increase in QPS
We encourage you to control both. It is not uncommon for our customers to implement their own throttling logic to prevent their own customers from emitting an abnormally large number of requests. Even without skip values, having a single customer send enough queries to increase the traffic multiple-fold can lead to throttling (I'm not sure if that was the case here). There are no official guidelines on how to handle queries coming from your client apps. The best approach, in my opinion, would be for your team to run performance tests using realistic workloads to understand the limits of your search service (which depend on the index schema, number of documents, type of queries being emitted, etc.). Once you have a good idea of how many QPS your service can handle for your scenarios, you can decide how much of that QPS you are willing to allocate to a single customer at a time, and enforce a limit based on that.
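As a rough sketch of that kind of per-customer limit (the rate and burst values below are assumptions you would derive from your own performance tests, not Azure Search guidance):

    import time
    from collections import defaultdict

    class PerCustomerRateLimiter(object):
        # Simple token bucket per customer; rate and burst are illustrative.

        def __init__(self, rate_per_second=5.0, burst=10.0):
            self.rate = rate_per_second
            self.burst = burst
            self.buckets = defaultdict(lambda: {'tokens': burst, 'ts': time.time()})

        def allow(self, customer_id):
            bucket = self.buckets[customer_id]
            now = time.time()
            # Refill tokens for the elapsed time, capped at the burst size.
            bucket['tokens'] = min(
                self.burst, bucket['tokens'] + (now - bucket['ts']) * self.rate)
            bucket['ts'] = now
            if bucket['tokens'] >= 1:
                bucket['tokens'] -= 1
                return True
            return False  # reject (e.g. HTTP 429) instead of hitting Azure Search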
Regarding the deep paging cost: if this is a common scenario for your customers (paging through all documents of a search index), I would recommend exposing a way to page through all documents directly from the data source (assuming Azure Search is not the primary data store of the documents), and using Azure Search mostly for relevance-related retrieval scenarios.

DynamoDB on-demand cost and scaling during a hot partition (adaptive scaling)

I understand the provisioned DB cost, but I have a few questions regarding on-demand mode.
Does on-demand pricing consider only the sum of the WRUs used by each partition, or the overall WRUs for the table based on the usage pattern, shared across the partitions?
When there is a hot partition, does on-demand mode increase the WRUs only for that partition, or does it increase the overall WRUs of the table?
Does adaptive capacity work with an on-demand table?
Example: an on-demand table with 10 partitions and a current peak at 1000 WRU.
If 2 hot partitions require more than 300 WRU, will it draw on adaptive capacity, or increase the overall WRU to 3000 WRU, resulting in higher cost?
I'm not a DynamoDB insider, so I can only answer from what I understand from their documentation.
In on-demand pricing (see https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html#HowItWorks.OnDemand) you pay exactly by the number and size of your requests. If you make one million requests, you will pay the same whether these requests were to a million different partitions, or they all went to the same partition.
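As a rough illustration of that model (the per-million price below is an assumption for illustration only; check the pricing page for your region): a write request unit covers up to 1 KB, and the bill is simply request units times the unit price, regardless of which partitions the writes land on.

    import math

    # Assumed on-demand price in USD per million write request units (illustrative only):
    PRICE_PER_MILLION_WRU = 1.25

    def write_request_units(item_size_kb):
        # One WRU per 1 KB of item size, rounded up, independent of the partition.
        return int(math.ceil(item_size_kb / 1.0))

    def on_demand_write_cost(num_writes, item_size_kb):
        wrus = num_writes * write_request_units(item_size_kb)
        return wrus * PRICE_PER_MILLION_WRU / 1000000.0

    # One million 0.5 KB writes cost the same whether they hit a single hot
    # partition or a million different partitions:
    print(on_demand_write_cost(1000000, 0.5))  # -> 1.25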
You might wonder, then, why load imbalance was ever a pricing issue in provisioned-capacity mode, or at least why the Web is full of stories about such an issue. There should never have been such an issue, and it has recently been fixed, but in the past it was real. Here is the story:
On the provisioned pricing page, Amazon claims that if you reserve 1000 WCU, you should be able to use that number of write units per second, and if you try to use more, you'll be throttled. There is no mention or warning of imbalanced loads or hot partitions... But people discovered that this wasn't quite true. Amazon had a bug in their throttling code: the usage counting wasn't done across the entire cluster. Instead, if your data was spread over 10 nodes, your reservation of 1000 was evenly split among them, so each of the 10 nodes would start to throttle you after 100 (1000/10) requests per second. This split worked well for well-balanced loads, but with hot partitions it didn't work well. People were paying for a reservation of 1000, but when they measured how much they were actually getting, they saw throttling after just 800 (for example) requests per second. Amazon acknowledged this was a bug and fixed it with their "adaptive capacity" technique, where each node picks a different throttling limit, adjusted until the user's total usage approaches what they paid for. This technique is explained in this excellent official talk:
https://www.youtube.com/watch?v=yvBR71D0nAQ - see time 19:38. Until very recently this "adaptive capacity" was a rather blunt instrument that only worked well if your workload didn't change quickly, but that issue has since been fixed too, as described in
https://aws.amazon.com/blogs/database/how-amazon-dynamodb-adaptive-capacity-accommodates-uneven-data-access-patterns-or-why-what-you-know-about-dynamodb-might-be-outdated/

Throttling limits for Azure Search

I'm looking for throttling information and this is the best that I've been able to find so far: https://learn.microsoft.com/en-us/azure/search/search-limits-quotas-capacity#throttling-limits
For doing a search
https://{{search-service}}.search.windows.net/indexes/:index/docs?api-version={{version}}&search=some text
Is this line from the reference page above the limit for searches?
Get Index (GET /indexes/myindex): 10 per second per search unit
I'm trying to see what the limit is for searching only, under an ideal scenario where nothing else is happening, such as an indexer running.
Some APIs such as GET /indexes are throttled based on simple rate limits. However, queries and indexing requests do not work this way. In the case of those APIs, throttling happens dynamically based on resource availability. If the system's internal queues start to fill, requests will begin to fail with 503 (Service Unavailable). If enough such failures happen within a discrete period of time (calculated as an average over a rolling window), the service will throttle requests in order to relieve pressure and allow the system to recover.
The reason throttling works this way instead of based on static rate limits is that most Azure Cognitive Search pricing tiers (other than Free) give you dedicated capacity. Static rate limits could artificially limit how you use your own capacity, so instead throttling dynamically applies backpressure as a way to ensure the reliability of the service when its capacity is overloaded.
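In practice this means query clients should treat 503 as a signal to back off and retry; a minimal sketch (the attempt count and delays are arbitrary, not an official recommendation):

    import time

    import requests

    def search_with_backoff(url, params, headers, max_attempts=5):
        # Retry queries that are throttled with 503, backing off exponentially.
        delay = 1.0
        for _ in range(max_attempts):
            resp = requests.get(url, params=params, headers=headers)
            if resp.status_code != 503:
                resp.raise_for_status()
                return resp.json()
            # The service is shedding load; wait and retry with a growing delay.
            time.sleep(delay)
            delay *= 2
        raise RuntimeError('Still throttled after %d attempts' % max_attempts)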
For more information about testing and performance tuning Azure Cognitive Search, see this article.
For Azure Search, there are two kinds of APIs: Query APIs (Search/Suggest/Autocomplete) and Index APIs.
The one you mentioned belongs to Index APIs:
Get Index (GET /indexes/myindex): 10 per second per search unit
If you want to know the Query APIs' (searching) limit (QPS limit), this doc will be helpful:

How many transactions per second can happen in MongoDB?

I am writing a feature that might lead to us executing a few hundred or even a thousand MongoDB transactions for a particular endpoint. I want to know whether there is a maximum limit to the number of transactions that can occur in MongoDB.
I read this old answer about SQL Server, Can SQL server 2008 handle 300 transactions a second?, but couldn't find anything on Mongo.
It's really hard to find an unbiased benchmark, let alone one that objectively reflects your projected workload.
Here is one, by the makers of Cassandra (obviously, Cassandra wins here): Cassandra vs. MongoDB vs. Couchbase vs. HBase.
It shows a few thousand operations per second as a starting point, and the numbers only go up as the cluster size grows.
Once again, the numbers here are just a baseline and cannot be used to correctly estimate the performance of your application on your data. Not all transactions are created equal.
Well, this isn't a direct answer to your question, but since you have quoted a comparison, I would like to share an experience with Couchbase. When it comes to Couchbase, a cluster's performance is usually limited by network bandwidth (assuming you have given it SSD/NVMe storage, which improves storage latency). I have achieved in excess of 400k TPS on a 4-node cluster running Couchbase 4.x and 5.x in a key/value (K/V) use case.
Node specs below:
12 core x 2 Xeon on HP BL460c blades
SAS SSDs (NVMe would generally be a lot better)
10 Gbps network within the blade chassis
Before we arrived here, we moved away from MongoDB, which was limiting system throughput to a few tens of thousands of operations per second at most.

Google App Engine - stats on datastore reads for a complete application

In order to decrease the cost of an existing application that over-consumes Datastore reads, I am trying to get stats on the application as a whole.
What I'd like to get for the overall application is stats about the queries that return the largest number of rows during a complete day of production. With the cost of retrieving data at $0.70 per million reads, there is a big incentive to optimise or cache some queries, but first I have to understand which queries retrieve too much data.
Appstats apparently does not provide this information, as the tool's primary purpose is to optimise individual RPC calls.
Does anyone have a magic solution for this one? One alternative I thought about was building a tool myself to log the number of rows returned after each query, but that looks like overkill and would require changing the code.
Thanks a lot for your help!
Hugues
See this related post: https://stackoverflow.com/questions/11282567/calculating-datastore-api-usage-per-request/
What you can do to measure and optimize is to look at the cost field provided by the LogService. (It's called cpm_usd in the admin panel).
Using this information you can find the most expensive URLs and thus optimize their queries.
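A minimal sketch of that approach using the Python logservice API, pulling recent request logs and ranking URLs by their reported cost (the time window and aggregation are illustrative):

    import time
    from collections import defaultdict

    from google.appengine.api.logservice import logservice

    def most_expensive_urls(hours=24, top_n=20):
        end = time.time()
        start = end - hours * 3600
        cost_per_url = defaultdict(float)
        for log in logservice.fetch(start_time=start, end_time=end):
            if log.cost:  # the cost field (shown as cpm_usd in the admin panel)
                cost_per_url[log.resource] += log.cost
        return sorted(cost_per_url.items(),
                      key=lambda item: item[1], reverse=True)[:top_n]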
