When using the Google App Engine Search API, if a query returns a large result set (>1000) and we iterate with a cursor to collect the entire set, we get indeterminate results for the documents returned whenever number_found_accuracy is lower than our result size.
In other words, the same query run twice, collecting all the documents via cursors, does not return the same documents, UNLESS our number_found_accuracy is higher than the result size (e.g., using the 10,000 maximum). Only then do we always get the same documents.
Our understanding of number_found_accuracy is that it should only affect the number_found estimate. We assumed that by using cursors to fetch all the results, we would get the same results as if we had run one large query.
Are we misunderstanding the use of number_found_accuracy or cursors, or have we found a bug?
Your understanding of number_found_accuracy is correct. I think that the behavior you're observing is the surprising interplay between replicated query failover and how queries that specify number_found_accuracy affect future queries using continuation tokens.
When you index documents using the Search API, we store them in several distinct replicas using the same mechanism as the High Replication Datastore (i.e., Megastore). How quickly those documents become live on each replica depends on many factors. It's usually immediate, but the delay can become much longer if you're doing batch writes to a single (index, namespace) pair.
Searches can get executed on any of these replicas. We'll even potentially run a search that uses a continuation token on a different replica than the original search. If the original replica and/or continuation replica are catching up on their indexing work, they might have different sets of live documents. It will become consistent "eventually" but it's not always immediate.
If you specify number_found_accuracy on a query, we have to run most of the query as if we're going to return number_found_accuracy results. We specifically have to read much further down the posting lists to find and count matching documents. Reading a posting list results in its associated file block being inserted into various caches.
In turn, when you run the continuation search using a cursor, the documents can be read much more quickly on the same replica that was used for the original search. You're thus less likely to have the continuation search fail over to a different replica that might not have finished indexing the same set of documents. I think the inconsistent results you've observed are the result of this kind of continuation query failover.
In summary, setting number_found_accuracy to something large effectively prewarms that replica's cache. It will thus almost certainly be the fastest replica for a continuation search. In the face of replicas that are trying to catch up on indexing, that will give the appearance that number_found_accuracy has a direct effect on the consistency of results, but in reality it's just a side-effect.
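For reference, here's roughly what that iteration looks like from the Go runtime's google.golang.org/appengine/search package, where CountAccuracy is the Go name for number_found_accuracy. The index name and document type below are placeholders, and, as I understand it, the iterator transparently issues the continuation searches for you, so the prewarming side effect applies to those follow-up requests:

package app

import (
	"context"

	"google.golang.org/appengine/search"
)

// Doc is a hypothetical document type; field names are illustrative.
type Doc struct {
	Name string
}

// collectAll gathers the full result set for a query. Setting CountAccuracy
// (number_found_accuracy) to the 10,000 maximum has the cache-prewarming
// side effect described above, which keeps the continuation requests on the
// same warm replica.
func collectAll(ctx context.Context, query string) ([]Doc, error) {
	index, err := search.Open("mydocs") // hypothetical index name
	if err != nil {
		return nil, err
	}
	var docs []Doc
	it := index.Search(ctx, query, &search.SearchOptions{
		CountAccuracy: 10000,
	})
	for {
		var d Doc
		if _, err := it.Next(&d); err == search.Done {
			break
		} else if err != nil {
			return nil, err
		}
		docs = append(docs, d)
	}
	return docs, nil
}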
I'm new to MongoDB but not new to databases. I created a collection of documents that look like this:
{
  _id: ObjectId('5e0d86e06a24490c4041bd7e'),
  ...,
  match: [{
    _id: ObjectId('5e0c35606a24490c4041bd71'),
    ts: 1234456,
    ...
  }]
}
So each document holds a list of objects, and within the list there may be many objects with the same _id field. I have a handful of documents in this collection, and my query that selects on particular match._id values is horribly slow. I mean unnaturally slow.
The query is simply this: {match: {$elemMatch: {_id: match._id}}}, and it literally hangs the system for about 15 seconds while returning 15 matching documents out of 25 total!
I put an index on the collection like this:
collection.createIndex({"match._id" : 1}) but that didn't help.
Explain says the execution time is 0 and that it's using the index, but the query still takes 15 seconds or longer to complete.
I'm getting the same slowness in Node.js and in Compass.
Explain Output:
{"explainVersion":"1","queryPlanner":{"namespace":"hp-test-39282b3a-9c0f-4e1f-b953-0a14e00ec2ef.lead","indexFilterSet":false,"parsedQuery":{"match":{"$elemMatch":{"_id":{"$eq":"5e0c3560e5a9e0cbd994fa52"}}}},"maxIndexedOrSolutionsReached":false,"maxIndexedAndSolutionsReached":false,"maxScansToExplodeReached":false,"winningPlan":{"stage":"FETCH","filter":{"match":{"$elemMatch":{"_id":{"$eq":"5e0c3560e5a9e0cbd994fa52"}}}},"inputStage":{"stage":"IXSCAN","keyPattern":{"match._id":1},"indexName":"match._id_1","isMultiKey":true,"multiKeyPaths":{"match._id":["match"]},"isUnique":false,"isSparse":false,"isPartial":false,"indexVersion":2,"direction":"forward","indexBounds":{"match._id":["[ObjectId('5e0c3560e5a9e0cbd994fa52'), ObjectId('5e0c3560e5a9e0cbd994fa52')]"]}}},"rejectedPlans":[]},"executionStats":{"executionSuccess":true,"nReturned":15,"executionTimeMillis":0,"totalKeysExamined":15,"totalDocsExamined":15,"executionStages":{"stage":"FETCH","filter":{"match":{"$elemMatch":{"_id":{"$eq":"5e0c3560e5a9e0cbd994fa52"}}}},"nReturned":15,"executionTimeMillisEstimate":0,"works":16,"advanced":15,"needTime":0,"needYield":0,"saveState":0,"restoreState":0,"isEOF":1,"docsExamined":15,"alreadyHasObj":0,"inputStage":{"stage":"IXSCAN","nReturned":15,"executionTimeMillisEstimate":0,"works":16,"advanced":15,"needTime":0,"needYield":0,"saveState":0,"restoreState":0,"isEOF":1,"keyPattern":{"match._id":1},"indexName":"match._id_1","isMultiKey":true,"multiKeyPaths":{"match._id":["match"]},"isUnique":false,"isSparse":false,"isPartial":false,"indexVersion":2,"direction":"forward","indexBounds":{"match._id":["[ObjectId('5e0c3560e5a9e0cbd994fa52'), ObjectId('5e0c3560e5a9e0cbd994fa52')]"]},"keysExamined":15,"seeks":1,"dupsTested":15,"dupsDropped":0}},"allPlansExecution":[]},"command":{"find":"lead","filter":{"match":{"$elemMatch":{"_id":"5e0c3560e5a9e0cbd994fa52"}}},"skip":0,"limit":0,"maxTimeMS":60000,"$db":"hp-test-39282b3a-9c0f-4e1f-b953-0a14e00ec2ef"},"serverInfo":{"host":"Dans-MacBook-Pro.local","port":27017,"version":"5.0.9","gitVersion":"6f7dae919422dcd7f4892c10ff20cdc721ad00e6"},"serverParameters":{"internalQueryFacetBufferSizeBytes":104857600,"internalQueryFacetMaxOutputDocSizeBytes":104857600,"internalLookupStageIntermediateDocumentMaxSizeBytes":104857600,"internalDocumentSourceGroupMaxMemoryBytes":104857600,"internalQueryMaxBlockingSortMemoryUsageBytes":104857600,"internalQueryProhibitBlockingMergeOnMongoS":0,"internalQueryMaxAddToSetBytes":104857600,"internalDocumentSourceSetWindowFieldsMaxMemoryBytes":104857600},"ok":1}
The explain output confirms that the operation that was explained is perfectly efficient. In particular we see:
The expected index being used with tight indexBounds
Efficient access of the data (totalKeysExamined == totalDocsExamined == nReturned)
No meaningful duration ("executionTimeMillis":0 which implies that the operation took less than 0.5ms for the database to execute)
Therefore the slowness you're experiencing for that particular operation is not related to the efficiency of the plan itself. This doesn't completely rule out the database (or its underlying server) as the source of the slowness, but it is usually a pretty strong indicator that the problem is elsewhere or that there are multiple factors at play.
I would suggest the following as potential next steps:
Check the mongod log file (you can confirm its location by running db.adminCommand({ getCmdLineOpts: 1 }) via a shell connected to the instance). By default, any operation slower than 100 ms is captured. This will help in a variety of ways:
If there is a log entry (with a meaningful duration) then it confirms that the slowness is being introduced while the database is processing the operation. It could also give some helpful hints as to why that might be the case (waiting for locks or server resources such as storage for example).
If an associated entry cannot be found, then that would be significantly stronger evidence that we are looking in the wrong place for the source of the slowness.
Is the operation that you gathered explain for the exact one that the application and Compass are observing as being slow? Were you connected to the same server and namespace? Is the explained operation simplified in some way, such as the original operation containing sort, projection, collation, etc?
As a relevant example that combines these two, I notice that there are skip and limit parameters applied to the command explained on a mongod seemingly running on a laptop. Are those parameters non-zero when running the application and does the application run against a different database with a larger data set?
The explain command doesn't exercise everything that an application would. Notably absent is the time it takes to send the results across the network. If you had particularly large documents, that could be a factor, though it seems unlikely to be the culprit in this particular situation.
How exactly are you measuring the full execution time? Does it potentially include the time to connect to the database? In this case you mentioned that Compass itself also demonstrates the slowness, so that may rule out most of this.
What else is running on the server hosting the database? Is there a container or VM involved? Would the database or the underlying server be experiencing resource contention due to concurrency?
Two additional minor asides:
25 total documents in a collection is extremely small. I would expect even the smallest hardware to be able to process such a request without an index unless there was some complicating factor.
Assuming that match is always an array, the $elemMatch operator is not strictly necessary for this particular query; you can query the array field directly, as in the sketch below, and you can read more about this in the MongoDB documentation on querying arrays. I would not expect this to have a performance impact in your situation.
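For illustration, here's a minimal sketch of that simplified query using the official Go driver (go.mongodb.org/mongo-driver, 1.x); the URI, database, and collection names are placeholders:

package main

import (
	"context"
	"fmt"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/bson/primitive"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	coll := client.Database("hp-test").Collection("lead")

	oid, err := primitive.ObjectIDFromHex("5e0c3560e5a9e0cbd994fa52")
	if err != nil {
		log.Fatal(err)
	}

	// {"match._id": oid} matches any array element whose _id equals oid and
	// uses the multikey index on match._id, just like the $elemMatch form.
	cur, err := coll.Find(ctx, bson.M{"match._id": oid})
	if err != nil {
		log.Fatal(err)
	}

	var docs []bson.M
	if err := cur.All(ctx, &docs); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("matched %d documents\n", len(docs))
}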
GeoMesa is a spatio-temporal database; more details are available here: http://www.geomesa.org/
I am trying the example tutorial, setting up an HBase database with it. I am running the HBase quick-start tutorial (http://www.geomesa.org/documentation/tutorials/geomesa-quickstart-hbase.html). The tutorial runs fine, but below are some problems I notice in the query performance of bounding boxes.
I have inserted data with a lat/lng range of (30,60) to (35,65).
With this setup, I am running queries on my local machine:
a) In my first query, the bounding box is (30,60) to (30.1,60.1); it runs in less than a second on average and returns correct results.
b) In the second query, I widened the bounding box to (10,10) to (30.1,60.1). This query returns the same results as query (a), which is expected, but it takes around 3-4 seconds per query on average.
Both queries return the same results, yet one runs much slower than the other. I notice similar behavior in temporal queries, where performance is even worse (10x slower or more) if the time ranges do not match the data inserted. My questions:
1) Is this expected behavior?
2) I know one solution could be to reformat the query to match the actual spatial and temporal ranges of the data inserted into GeoMesa, but that would require me to maintain additional metadata about the data. Could a better solution be designed at the GeoMesa layer?
Let me know if there are any settings that could affect this behavior. I have seen the same behavior on multiple other local machines and on cloud VMs running GeoMesa.
In general, GeoMesa still has to scan where there might be data, even if there isn't actually any data there. Opening a scan, even if it returns no data, takes some time. For temporal queries, the number of ranges tends to be even larger, hence the slower performance.
I believe that Accumulo handles this a bit better than HBase, as it has a concept of a batch scanner that accepts multiple ranges, and it has some knowledge of the data start/end. For HBase, GeoMesa has to run multiple scans using a thread pool, so it's not as efficient.
GeoMesa also has the concept of data statistics, but it hasn't been implemented for HBase yet, and it's not currently used in query planning.
To mitigate the issue, you can try increasing the "queryThreads" data store parameter, in order to use more threads during queries. You can also enable "looseBoundingBox", if you have currently disabled it. For temporal queries, increasing the temporal binning period may cause fewer ranges to be scanned. However this may result in slower queries for very small temporal ranges, so it should be tailored to your use case.
As a final note, make sure that you have the distributed coprocessors installed and enabled, especially if you are not using loose bounding boxes.
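To make the cost model concrete, here is a generic sketch (not GeoMesa's actual code) of scanning many ranges with a bounded worker pool, analogous to the "queryThreads" parameter above: each scan carries a fixed setup cost even when it returns no rows, so a query that decomposes into many empty ranges still pays that cost many times.

package main

import (
	"fmt"
	"sync"
	"time"
)

type scanRange struct{ start, end string } // hypothetical key range

// scan simulates opening one range scan: fixed setup overhead, often no rows.
func scan(r scanRange) int {
	time.Sleep(5 * time.Millisecond) // stand-in for per-scan setup cost
	return 0                         // "empty" ranges still paid the cost
}

// scanAll drains the ranges with a bounded pool of worker goroutines.
func scanAll(ranges []scanRange, threads int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	work := make(chan scanRange)
	for i := 0; i < threads; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range work {
				n := scan(r)
				mu.Lock()
				total += n
				mu.Unlock()
			}
		}()
	}
	for _, r := range ranges {
		work <- r
	}
	close(work)
	wg.Wait()
	return total
}

func main() {
	ranges := make([]scanRange, 500) // a wide bbox can decompose into many ranges
	start := time.Now()
	scanAll(ranges, 8) // more threads (cf. "queryThreads") hide more overhead
	fmt.Println("elapsed:", time.Since(start))
}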
In my application I have a datastore query with a filter, such as:
datastore.NewQuery("sometype").Filter("SomeField<", 10)
I'm using a cursor to iterate batches of the result (e.g in different tasks). If the value of SomeField is changed while iterating over it, the cursor will no longer work on google app engine (works fine on devappserver).
I have a test project here: https://github.com/fredr/appenginetest
In my test I ran /db, which sets up the datastore with 10 items whose values are set to 0, then ran /run/2, which iterates over all items where the value is less than 2, in batches of 5, and updates the value of each item to 2.
On my local devappserver, all ten items are updated. On App Engine, only five items are updated.
Am I doing something wrong? Is this a bug? Or is this the expected result?
In the documentation it states:
Cursors don't always work as expected with a query that uses an inequality filter or a sort order on a property with multiple values.
The problem is the nature and implementation of cursors. The cursor contains the (encoded) key of the last processed entity, so if you set a cursor on your query before executing it, the Datastore will jump to the entity specified by the key encoded in the cursor and start listing entities from that point.
Let's examine your case
Your query filter is Value<2. You iterate over the entities of the query result, and you change (and save) the Value property to 2. Note that Value=2 does not satisfy the filter Value<2.
In the next iteration (next batch) a cursor is present, which you apply properly. When the Datastore executes the query, it jumps to the last entity processed in the previous iteration and wants to list entities that come after it. But the entity pointed to by the cursor may no longer satisfy the filter, because the index entry for its new Value of 2 will most likely have been updated already. This is non-deterministic; see eventual consistency for more details. It applies here because you did not use an ancestor query, which would guarantee strongly consistent results; the time.Sleep() delay just increases the probability that the index has caught up.
So the Datastore sees that the last processed entity no longer satisfies the filter. It will not search all the entities again; it will report that no more entities match the filter, hence no more entities will be updated (and no errors will be reported).
Suggestion: don't use cursors and filter or sort by the same property you're updating at the same time.
By the way:
The part from the App Engine docs you quoted:
Cursors don't always work as expected with a query that uses an inequality filter or a sort order on a property with multiple values.
This is not what you think. This means: cursors may not work properly on a property which has multiple values AND the same property is either included in an inequality filter or is used to sort the results by.
By the way #2
In the screenshot you are using SDK 1.9.17. The latest SDK version is 1.9.21. You should update it and always use the latest available version.
Alternatives to achieve your goal
1) Don't use cursors
If you have many records, you won't be able to update all your entities in one step (one loop), but suppose you update 300 entities at a time. If you repeat the query, the already-updated entities will not be in the results of the next run, because the updated Value=2 no longer satisfies the filter Value<2. Just redo the query+update until the query returns no results (see the sketch below). Since your change is idempotent, it causes no harm if the update of an entity's index entry is delayed and the entity gets returned by the query multiple times. It is best to delay the execution of the next query to minimize the chance of this (e.g. wait a few seconds between queries).
Pros: Simple. You already have the solution, just exclude the cursor handling part.
Cons: Some entities might get updated multiple times (therefore the change must be idempotent). Also the change performed on entities must be something which will exclude the entity from the next query.
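A minimal sketch of this approach in Go, using the same appengine/datastore API as the question (the entity kind and field names follow the example above and are illustrative):

package app

import (
	"context"
	"time"

	"google.golang.org/appengine/datastore"
)

// Record is a hypothetical entity type matching the scenario above.
type Record struct {
	Value int
}

// updateAll repeats the query+update until no entities match. The change is
// idempotent, so an entity returned twice (due to a stale index entry) is
// harmless.
func updateAll(ctx context.Context) error {
	for {
		q := datastore.NewQuery("sometype").Filter("Value<", 2).Limit(300)
		var records []Record
		keys, err := q.GetAll(ctx, &records)
		if err != nil {
			return err
		}
		if len(keys) == 0 {
			return nil // nothing left matching Value<2; we're done
		}
		for i := range records {
			records[i].Value = 2 // no longer satisfies the filter
		}
		if _, err := datastore.PutMulti(ctx, keys, records); err != nil {
			return err
		}
		// Wait a bit before re-querying to reduce the chance of stale index
		// entries returning already-updated entities.
		time.Sleep(2 * time.Second)
	}
}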
2) Using Task Queue
You could first execute a keys-only query and defer the updates to tasks. You could create tasks passing, say, 100 keys to each; each task loads its entities by key and performs the update, as in the sketch below. This ensures each entity is updated only once. This solution has a slightly larger delay due to involving the task queue, but that is not a problem in most cases.
Pros: No duplicated updates (therefore change may be non-idempotent). Works even if the change to be performed would not exclude the entity from the next query (more general).
Cons: Higher complexity. Bigger lag/delay.
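A sketch of this approach using the google.golang.org/appengine/delay package, reusing the Record type from the previous sketch; keys are passed as encoded strings so they serialize cleanly into task payloads:

package app

import (
	"context"

	"google.golang.org/appengine/datastore"
	"google.golang.org/appengine/delay"
)

// updateBatch runs on the task queue; each task loads up to 100 entities by
// key and updates them exactly once.
var updateBatch = delay.Func("update-batch", func(ctx context.Context, encoded []string) error {
	keys := make([]*datastore.Key, len(encoded))
	for i, e := range encoded {
		k, err := datastore.DecodeKey(e)
		if err != nil {
			return err
		}
		keys[i] = k
	}
	records := make([]Record, len(keys))
	if err := datastore.GetMulti(ctx, keys, records); err != nil {
		return err
	}
	for i := range records {
		records[i].Value = 2
	}
	_, err := datastore.PutMulti(ctx, keys, records)
	return err
})

// enqueueUpdates runs a keys-only query and fans the keys out to tasks.
func enqueueUpdates(ctx context.Context) error {
	keys, err := datastore.NewQuery("sometype").Filter("Value<", 2).KeysOnly().GetAll(ctx, nil)
	if err != nil {
		return err
	}
	for start := 0; start < len(keys); start += 100 {
		end := start + 100
		if end > len(keys) {
			end = len(keys)
		}
		batch := make([]string, 0, end-start)
		for _, k := range keys[start:end] {
			batch = append(batch, k.Encode())
		}
		if err := updateBatch.Call(ctx, batch); err != nil {
			return err
		}
	}
	return nil
}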
3) Using Map-Reduce
You could use the map-reduce framework/utility to do massively parallel processing of many entities. I'm not sure whether it has been implemented in Go.
Pros: Parallel execution; it can handle even millions or billions of entities, and is much faster for large numbers of entities. Plus the pros listed at 2) Using Task Queue.
Cons: Higher complexity. Might not be available in Go yet.
I have an index with something like 60-100 Million documents. We almost always query these documents (in addition to other filter queries and field queries, etc) on a foreign key id, to scope the query to a specific parent object.
So, for example: /solr/select?q=*:*&fq=parent_id_s:42
Yes, that _s means this is currently a solr.StrField field type.
My question is: should I change this to a TrieIntField? Would that speed up performance? And if so, what would be the ideal precisionStep and positionIncrementGap values, given that I know that I will always be querying for a single specific value, and that the cardinality of that parent_id is in the 10,000-100,000 (maximum) order of magnitude?
Edit for additional detail (from a comment on an answer below):
The way our system is used, it turns out that we end up using that same fq for many queries in a row. And when the cache is populated, the system runs blazing fast. When the cache gets dumped because of a commit, this query (even a test case with ONLY this fq) can take up to 20 seconds. So I'm trying to figure out how to speed up that initial query that populates the cache.
Second Edit:
I apologize; after further testing it turns out that the poor performance above only happens when facet fields are also being returned (e.g. &facet=true&facet.field=resolved_facet_facet). With a dozen or so of these fields, the query sometimes takes 20-30 seconds, but only with a fresh searcher. It's instant once the cache is populated. So maybe my problem is the facet fields, not the parent_id field.
TrieIntField with a precisionStep is optimized for range queries. Since you're only searching for a specific value, your field type is already optimal.
Have you looked at autowarming queries? These run whenever a new IndexSearcher is being created (on startup, on an index commit for example), so that it becomes available with some cache already in place. Depending on your requirements, you can also set useColdSearcher flag to true, so that the new Searcher is only available when the cache has been warmed. For more details have a look here: https://cwiki.apache.org/confluence/display/solr/Query+Settings+in+SolrConfig#QuerySettingsinSolrConfig-Query-RelatedListeners
It sounds like you probably aren't getting much benefit from the caching of result sets from the filter. One of the more important features of filters is that they cache their result sets. This makes the first run of a certain filter take longer while a cache is built, but subsequent uses of the same filter are much faster.
With the cardinality you've described, you are probably just wasting cycles, and polluting the filter cache, by building caches without them ever being of use. You can turn off caching of a filter query like:
/solr/select?q=*:*&fq={!cache=false}parent_id_s:42
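If you're issuing the query programmatically, here's a minimal sketch in Go using only the standard library; the host, core name, and field value are placeholders:

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/url"
)

func main() {
	params := url.Values{}
	params.Set("q", "*:*")
	// {!cache=false} is a local param telling Solr not to store this
	// filter's result set in the filter cache.
	params.Set("fq", "{!cache=false}parent_id_s:42")
	params.Set("wt", "json")

	resp, err := http.Get("http://localhost:8983/solr/mycore/select?" + params.Encode())
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body))
}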
I also think a filter query doesn't help in this case.
q=parent_id_s:42 queries the index by the term parent_id_s:42 and gets back a set of document ids. Since the postings (document ids) are indexed by term, and assuming you have enough memory to hold them (either in the JVM or the OS cache), this lookup should be pretty fast.
Assuming filter cache is already warmed up and you have 100% hit ratio, which one of the following is faster?
q=parent_id_s:42
fq=parent_id_s:42
I think they are very close, but I could be wrong. Does anyone know? Has anyone run a performance test for this?
Say I have a query that returns 10,000 records. When the first record has returned what can I assume about the state of my query?
Has it finished and is just returning records from the server to my instance of SSMS?
Is the query itself still being executed on the server?
What is it that causes the 10,000 records to be returned slowly for one query and nearly instantly for another?
There is potentially some mix of progressive processing on the server side, network transfer of the data, and rendering by the client.
If one query returns 10,000 rows quickly, and another one slowly -- and they are of similar row size, data types, etc., and are both destined for results to grid or results to text -- there is little we can do to analyze the differences unless you show us execution plans and/or client statistics for each one. These are options you can set in SSMS when running a query.
As an aside, switching between results to grid and results to text you might notice slightly different runtimes. This is because in one case Management Studio has to work harder to align the columns etc.
You cannot make a generic assumption: a query plan is composed of a number of different types of operations, or iterators. Some of these are navigational and work like a pipeline, whilst others are set-based operations, such as a sort.
If a query contains a set-based operation, it requires all the records before it can output any results (e.g. an ORDER BY clause within your statement). If you have no set-based iterators, you can expect the rows to be streamed to you as they become available.
The answer to each of your individual questions is "it depends."
For example, consider a query with an order by clause where there is no index on the column(s) you're ordering by. In this case, the server has to find all the records that satisfy your query and then sort them before it can return the first record. This causes a long pause before you get your first record, but you should normally get the rest quite quickly once they start arriving.
Without the order by clause, the server will normally send each record as it's found, so the first record will often show up sooner, but you may see a long pause between one record and the next.
As far as simply "why is one query faster than another": a lot depends on what indexes are available and whether they can be used for a particular query. For example, something like some_column LIKE '%something%' will almost always be quite slow. The leading '%' means this won't be able to use an index, even if some_column has one. A search for 'something%' instead of '%something%' might easily be 100 or 1000 times faster. If you really need the former, you really want to use full-text searching instead (create a full-text index and use CONTAINS() instead of LIKE).
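To make that concrete, here's a minimal sketch comparing the two predicates from Go, assuming the github.com/denisenkom/go-mssqldb driver and a full-text index on some_column; the DSN, table, and column names are placeholders:

package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/denisenkom/go-mssqldb" // registers the "sqlserver" driver
)

// count runs a query with one named parameter and counts the rows returned.
func count(db *sql.DB, query string, arg sql.NamedArg) int {
	rows, err := db.Query(query, arg)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()
	n := 0
	for rows.Next() {
		n++
	}
	return n
}

func main() {
	db, err := sql.Open("sqlserver", "sqlserver://user:pass@localhost:1433?database=mydb")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Leading wildcard: an index on some_column can't be used, so this scans.
	slow := count(db,
		"SELECT id FROM mytable WHERE some_column LIKE @pattern",
		sql.Named("pattern", "%something%"))

	// Full-text search: needs a full-text index, but avoids the scan.
	fast := count(db,
		"SELECT id FROM mytable WHERE CONTAINS(some_column, @term)",
		sql.Named("term", "something"))

	fmt.Println(slow, fast)
}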
Of course, a lot can also depend simply on whether the database has an index for a particular column (or group of columns). With a suitable index, the query will usually be quite a lot faster.