Questions about the behavior of getting a datastore Cursor - google-app-engine

I have a question about getting a Cursor.
Target function:
https://godoc.org/google.golang.org/appengine/datastore#Iterator.Cursor
As far as I can tell from the following code, an offset is set when getting a Cursor:
https://github.com/golang/appengine/blob/master/datastore/query.go#L702-L705
When I checked the stack trace in the GCP console after this function was executed, Insights displayed a warning:
Issue: Use of offset in datastore queries.
Description: Your app made 1 remote procedure calls to datastore.query() and datastore.next() using offset.
Recommendation: Use cursor instead of offset.
Query Details
g.co/gae/datastore/offset 10
g.co/gae/datastore/skipped 10
Since offset affects performance and billing, I want to avoid this behavior.
Is there a way to avoid using offset? Or is this the correct behavior?

From Offsets versus cursors:
Although Cloud Datastore supports integer offsets, you should avoid
using them. Instead, use cursors. Using an offset only avoids
returning the skipped entities to your application, but these entities
are still retrieved internally. The skipped entities do affect the
latency of the query, and your application is billed for the read
operations required to retrieve them. Using cursors instead of offsets
lets you avoid all these costs.
The q.offset you're referring to is an internal variable used for the Cursor implementation; it's not the explicit query offset that the above quote mentions.
So you should be fine using Cursor.
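To make that concrete, here is a minimal sketch of the intended usage with the Go package linked above. The Book kind, its Title field, and the page size of 10 are illustrative assumptions, not something from the original question; the point is that the cursor comes from Iterator.Cursor() and a later request resumes with Start(), while the application itself never calls Offset().

package books

import (
	"context"

	"google.golang.org/appengine/datastore"
)

type Book struct {
	Title string
}

// readPage returns up to 10 entities plus a web-safe cursor string that a
// later request can use to continue the query. The application never calls
// Offset(); the q.offset from query.go is set internally by the Cursor
// implementation.
func readPage(ctx context.Context, startCursor string) ([]Book, string, error) {
	q := datastore.NewQuery("Book").Limit(10)
	if startCursor != "" {
		cur, err := datastore.DecodeCursor(startCursor)
		if err != nil {
			return nil, "", err
		}
		q = q.Start(cur)
	}

	var books []Book
	it := q.Run(ctx)
	for {
		var b Book
		if _, err := it.Next(&b); err == datastore.Done {
			break
		} else if err != nil {
			return nil, "", err
		}
		books = append(books, b)
	}

	cur, err := it.Cursor() // the call the question is about
	if err != nil {
		return nil, "", err
	}
	return books, cur.String(), nil
}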

Related

Index on array element attribute extremely slow

I'm new to mongodb but not new to databases. I created a collection of documents that look like this:
{
  _id: ObjectId('5e0d86e06a24490c4041bd7e'),
  ...,
  match: [{
    _id: ObjectId('5e0c35606a24490c4041bd71'),
    ts: 1234456,
    ...
  }]
}
So there is a list of objects in each document, and within the list there might be many objects with the same _id field. I have a handful of documents in this collection, and my query that selects on specific match._id values is horribly slow. I mean unnaturally slow.
The query is simply this: {match: {$elemMatch: {_id: match._id}}} and it literally hangs the system for about 15 seconds, returning 15 matching documents out of 25 total!
I put an index on the collection like this:
collection.createIndex({"match._id" : 1}) but that didn't help.
Explain says execution time is 0 and that it's using the index, but the query still takes 15 seconds or longer to complete.
I'm getting the same slowness in nodejs and in compass.
Explain Output:
{"explainVersion":"1","queryPlanner":{"namespace":"hp-test-39282b3a-9c0f-4e1f-b953-0a14e00ec2ef.lead","indexFilterSet":false,"parsedQuery":{"match":{"$elemMatch":{"_id":{"$eq":"5e0c3560e5a9e0cbd994fa52"}}}},"maxIndexedOrSolutionsReached":false,"maxIndexedAndSolutionsReached":false,"maxScansToExplodeReached":false,"winningPlan":{"stage":"FETCH","filter":{"match":{"$elemMatch":{"_id":{"$eq":"5e0c3560e5a9e0cbd994fa52"}}}},"inputStage":{"stage":"IXSCAN","keyPattern":{"match._id":1},"indexName":"match._id_1","isMultiKey":true,"multiKeyPaths":{"match._id":["match"]},"isUnique":false,"isSparse":false,"isPartial":false,"indexVersion":2,"direction":"forward","indexBounds":{"match._id":["[ObjectId('5e0c3560e5a9e0cbd994fa52'), ObjectId('5e0c3560e5a9e0cbd994fa52')]"]}}},"rejectedPlans":[]},"executionStats":{"executionSuccess":true,"nReturned":15,"executionTimeMillis":0,"totalKeysExamined":15,"totalDocsExamined":15,"executionStages":{"stage":"FETCH","filter":{"match":{"$elemMatch":{"_id":{"$eq":"5e0c3560e5a9e0cbd994fa52"}}}},"nReturned":15,"executionTimeMillisEstimate":0,"works":16,"advanced":15,"needTime":0,"needYield":0,"saveState":0,"restoreState":0,"isEOF":1,"docsExamined":15,"alreadyHasObj":0,"inputStage":{"stage":"IXSCAN","nReturned":15,"executionTimeMillisEstimate":0,"works":16,"advanced":15,"needTime":0,"needYield":0,"saveState":0,"restoreState":0,"isEOF":1,"keyPattern":{"match._id":1},"indexName":"match._id_1","isMultiKey":true,"multiKeyPaths":{"match._id":["match"]},"isUnique":false,"isSparse":false,"isPartial":false,"indexVersion":2,"direction":"forward","indexBounds":{"match._id":["[ObjectId('5e0c3560e5a9e0cbd994fa52'), ObjectId('5e0c3560e5a9e0cbd994fa52')]"]},"keysExamined":15,"seeks":1,"dupsTested":15,"dupsDropped":0}},"allPlansExecution":[]},"command":{"find":"lead","filter":{"match":{"$elemMatch":{"_id":"5e0c3560e5a9e0cbd994fa52"}}},"skip":0,"limit":0,"maxTimeMS":60000,"$db":"hp-test-39282b3a-9c0f-4e1f-b953-0a14e00ec2ef"},"serverInfo":{"host":"Dans-MacBook-Pro.local","port":27017,"version":"5.0.9","gitVersion":"6f7dae919422dcd7f4892c10ff20cdc721ad00e6"},"serverParameters":{"internalQueryFacetBufferSizeBytes":104857600,"internalQueryFacetMaxOutputDocSizeBytes":104857600,"internalLookupStageIntermediateDocumentMaxSizeBytes":104857600,"internalDocumentSourceGroupMaxMemoryBytes":104857600,"internalQueryMaxBlockingSortMemoryUsageBytes":104857600,"internalQueryProhibitBlockingMergeOnMongoS":0,"internalQueryMaxAddToSetBytes":104857600,"internalDocumentSourceSetWindowFieldsMaxMemoryBytes":104857600},"ok":1}
The explain output confirms that the operation that was explained is perfectly efficient. In particular we see:
The expected index being used with tight indexBounds
Efficient access of the data (totalKeysExamined == totalDocsExamined == nReturned)
No meaningful duration ("executionTimeMillis":0 which implies that the operation took less than 0.5ms for the database to execute)
Therefore the slowness that you're experiencing for that particular operation is not related to the efficiency of the plan itself. This doesn't always rule out the database (or its underlying server) as the source of the slowness completely, but it is usually a pretty strong indicator that either the problem is elsewhere or that there are multiple factors at play.
I would suggest the following as potential next steps:
Check the mongod log file (you can confirm its location by running db.adminCommand("getCmdLineOpts") via the shell connected to the instance). By default any operation slower than 100ms is captured. This will help in a variety of ways:
If there is a log entry (with a meaningful duration) then it confirms that the slowness is being introduced while the database is processing the operation. It could also give some helpful hints as to why that might be the case (waiting for locks or server resources such as storage for example).
If an associated entry cannot be found, then that would be significantly stronger evidence that we are looking in the wrong place for the source of the slowness.
Is the operation that you gathered explain for the exact one that the application and Compass are observing as being slow? Were you connected to the same server and namespace? Is the explained operation simplified in some way, such as the original operation containing sort, projection, collation, etc?
As a relevant example that combines these two, I notice that there are skip and limit parameters applied to the command explained on a mongod seemingly running on a laptop. Are those parameters non-zero when running the application and does the application run against a different database with a larger data set?
The explain command doesn't include everything that an application would. Notably absent is the actual time it takes to send the results across the network. If you had particularly large documents that could be a factor, though it seems unlikely to be the culprit in this particular situation.
How exactly are you measuring the full execution time? Does it potentially include the time to connect to the database? In this case you mentioned that Compass itself also demonstrates the slowness, so that may rule out most of this.
What else is running on the server hosting the database? Is there a container or VM involved? Would the database or the underlying server be experiencing resource contention due to concurrency?
Two additional minor asides:
25 total documents in a collection is extremely small. I would expect even the smallest hardware to be able to process such a request without an index unless there was some complicating factor.
Assuming that match is always an array, the $elemMatch operator is not strictly necessary for this particular query. You can read more about that here. I would not expect this to have a performance impact for your situation.
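For illustration, a small sketch using the Go driver (the question itself uses Node.js and Compass, so treat the driver choice, database name, and URI here as assumptions): it creates the multikey index from the question and runs the equivalent query without $elemMatch, matching on match._id directly.

package main

import (
	"context"
	"fmt"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/bson/primitive"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	coll := client.Database("hp-test").Collection("lead") // illustrative names

	// Multikey index on the embedded _id, as in the question.
	_, err = coll.Indexes().CreateOne(ctx, mongo.IndexModel{
		Keys: bson.D{{Key: "match._id", Value: 1}},
	})
	if err != nil {
		log.Fatal(err)
	}

	id, err := primitive.ObjectIDFromHex("5e0c3560e5a9e0cbd994fa52")
	if err != nil {
		log.Fatal(err)
	}

	// Equivalent to {match: {$elemMatch: {_id: id}}} when only a single
	// condition is applied to the array elements.
	cur, err := coll.Find(ctx, bson.M{"match._id": id})
	if err != nil {
		log.Fatal(err)
	}
	var docs []bson.M
	if err := cur.All(ctx, &docs); err != nil {
		log.Fatal(err)
	}
	fmt.Println("matched", len(docs), "documents")
}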

Google Appengine Cursors

I'm using both ndb and search-api queries in my python appengine project.
The only official docs on cursors I can find:
https://cloud.google.com/appengine/docs/python/datastore/query-cursors
https://cloud.google.com/appengine/docs/python/search/cursorclass
Following things are unclear for me:
1. What is the cursor time-to-live? Can I expose year-old cursors?
2. How would cursor pagination behave if items are added to or removed from the original collection? (And if the cursor points to a particular record, what happens if that record no longer exists?)
3. How does query ordering affect the above?
4. Are there any fundamental differences between ndb and search-api cursors?
I'm answering from the ndb perspective; I haven't used the search API. All quotes are from your first link.
For 1 and 3 (since ordering is considered part of the original query from the cursor's perspective):
To retrieve additional results from the point of the cursor, the
application prepares a similar query with the same entity kind,
filters, and sort orders, and passes the cursor to the query's
with_cursor() method before performing the retrieval
So it doesn't really matter how old the cursor is (i.e. how old its query is), since its original query must be restored for the cursor to be used (see the sketch at the end of this answer).
For 2:
Cursors and data updates
The cursor's position is defined as the location in the result list
after the last result returned. A cursor is not a relative position in
the list (it's not an offset); it's a marker to which Cloud Datastore
can jump when starting an index scan for results. If the results for a
query change between uses of a cursor, the query notices only changes
that occur in results after the cursor. If a new result appears before
the cursor's position for the query, it will not be returned when the
results after the cursor are fetched. Similarly, if an entity is no
longer a result for a query but had appeared before the cursor, the
results that appear after the cursor do not change. If the last result
returned is removed from the result set, the cursor still knows how to
locate the next result.
When retrieving query results, you can use both a start cursor and an
end cursor to return a continuous group of results from Cloud
Datastore. When using a start and end cursor to retrieve the results,
you are not guaranteed that the size of the results will be the same
as when you generated the cursors. Entities may be added or deleted
from Cloud Datastore between the time the cursors are generated and
when they are used in a query.
The Java equivalent page at Limitations of cursors mentions some errors that can be raised for inconsistencies:
New App Engine releases might change internal implementation details,
invalidating cursors that depend on them. If an application attempts
to use a cursor that is no longer valid, Cloud Datastore raises an
IllegalArgumentException (low-level API), JDOFatalUserException
(JDO), or PersistenceException (JPA).
I suspect Python would be raising some similar errors as well.
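To make points 1 and 3 concrete, here is a minimal sketch (written against the Go datastore API for brevity; in ndb the equivalents are Query.fetch_page() and, in the older db API, with_cursor()). The Article kind, filter, and sort order are illustrative:

package cursors

import (
	"context"

	"google.golang.org/appengine/datastore"
)

// resume restores a query from a cursor string that may have been stored for
// an arbitrarily long time (memcache, the datastore itself, a URL parameter).
// The query must be rebuilt with the same kind, filters, and sort orders as
// the query that produced the cursor, because the cursor is only a position
// marker within that query's index scan, not an offset into a result list.
func resume(ctx context.Context, saved string) (*datastore.Iterator, error) {
	cur, err := datastore.DecodeCursor(saved)
	if err != nil {
		return nil, err
	}
	q := datastore.NewQuery("Article").
		Filter("Published =", true).
		Order("-Created").
		Start(cur)
	return q.Run(ctx), nil
}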

Number Found Accuracy on Search API Affecting Cursor Results

When using the google app engine search API, if we have a query that returns a large result set (>1000), and need to iterate using the cursor to collect the entire result set, we are getting indeterminate results for the documents returned if the number_found_accuracy is lower than our result size.
In other words, the same query run twice, collecting all the documents via cursors, does not return the same documents, UNLESS our number_found_accuracy is higher than the result size (e.g., using the 10000 max). Only then do we always get the same documents.
Our understanding of how the number_found_accuracy is supposed to work is that it would only affect the number_found estimation. We assumed that if you use the cursor to get all the results, you would be able to get the same results as if you had run one large query.
Are we misunderstanding the use of number_found_accuracy or cursors, or have we found a bug?
Your understanding of number_found_accuracy is correct. I think that the behavior you're observing is the surprising interplay between replicated query failover and how queries that specify number_found_accuracy affect future queries using continuation tokens.
When you index documents using the Search API, we store them in several distinct replicas using the same mechanism as the High Replication Datastore (i.e., Megastore). How quickly those documents become live on each replica depends on many factors. It's usually immediate, but the delay can become much longer if you're doing batch writes to a single (index, namespace) pair.
Searches can get executed on any of these replicas. We'll even potentially run a search that uses a continuation token on a different replica than the original search. If the original replica and/or continuation replica are catching up on their indexing work, they might have different sets of live documents. It will become consistent "eventually" but it's not always immediate.
If you specify number_found_accuracy on a query, we have to run most of the query as if we're going to return number_found_accuracy results. We specifically have to read much further down the posting lists to find and count matching documents. Reading a posting list results in its associated file block being inserted into various caches.
In turn, when you do the search using a cursor we'll be able to read the document for real much more quickly on the same replica that we'd used for the original search. You're thus less likely to have the continuation search failover to a different replica that might not have finished indexing the same set of documents. I think that the inconsistent results you've observed are the result of this kind of continuation query failover.
In summary, setting number_found_accuracy to something large effectively prewarms that replica's cache. It will thus almost certainly be the fastest replica for a continuation search. In the face of replicas that are trying to catch up on indexing, that will give the appearance that number_found_accuracy has a direct effect on the consistency of results, but in reality it's just a side-effect.

What is the cost difference between paging with a cursor or using offset?

When creating a results page with [Next Page] and [Prev Page] buttons, what is the cost difference between using a cursor to do this or using offset? And what are the pros and cons of each technique?
As a concrete example, what is the cost of reading results 100-110.
I have seen claims that offset uses "small datastore operations" and some that claim it uses a full "read operation" for each entity skipped.
As for cursors, I have read that they cannot page backwards, but I noticed a new Cursor.reverse() method for the first time today.
I assume that the disadvantages of using a cursor are that you cannot jump to a page by number e.g. straight to results 90-100.
Skipping results costs you a small datastore operation per skipped result. It's also slower than using cursors.
As you observe, reverse cursors are now available, which will allow you to page backwards, as long as the appropriate indexes for your query exist.
You can, of course, combine both cursors and offsets, if you want to skip to page 'n'.
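For illustration, a sketch of both approaches to fetching "results 100-110" using the Go datastore API; the Item kind and page size are assumptions. The offset version still retrieves the 100 skipped entities internally, while the cursor version resumes directly from a position captured when the previous page was read:

package paging

import (
	"context"

	"google.golang.org/appengine/datastore"
)

type Item struct {
	Name string
}

// Offset: the 100 skipped entities are still retrieved internally, billed as
// small datastore operations, and add latency before the first result.
func pageByOffset(ctx context.Context) ([]Item, error) {
	var items []Item
	_, err := datastore.NewQuery("Item").Offset(100).Limit(10).GetAll(ctx, &items)
	return items, err
}

// Cursor: resume from a cursor captured when the previous page was fetched
// (for example, carried in the [Next Page] link); only the 10 returned
// entities are read. The cursor for the following page is obtained via
// Query.Run() and Iterator.Cursor() when the page is read.
func pageByCursor(ctx context.Context, nextPageCursor string) ([]Item, error) {
	cur, err := datastore.DecodeCursor(nextPageCursor)
	if err != nil {
		return nil, err
	}
	var items []Item
	_, err = datastore.NewQuery("Item").Start(cur).Limit(10).GetAll(ctx, &items)
	return items, err
}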

Oracle stored procedure, returning ref cursor vs associative arrays

Our DBA requires us to return all tabular data from stored procedures in a set of associative arrays rather than using a ref cursor, which is what I see in most examples on the web. He says this is because it is much faster for Oracle to do things this way, but it seems counterintuitive to me because the data needs to be looped over twice: once in the stored procedure and then again in the application when it is processed. Also, values often need to be cast from their native types to varchar so they can be stored in the array, and then cast back on the application side. Using this method also makes it difficult to use ORM tools because they seem to want ref cursors in most cases.
An example of a stored procedure is the following:
PROCEDURE sample_procedure (
    p_One   OUT varchar_array_type,
    p_Two   OUT varchar_array_type,
    p_Three OUT varchar_array_type,
    p_Four  OUT varchar_array_type
)
IS
    p_title_procedure_name VARCHAR2(100) := 'sample_procedure';
    v_start_time           DATE          := SYSDATE;

    CURSOR cur IS
        SELECT e.one, e.two, e.three, e.four
        FROM package.table e
        WHERE filter = 'something';

    v_counter PLS_INTEGER := 0;
BEGIN
    -- Copy each column of every fetched row into its own output array.
    FOR rec IN cur LOOP
        v_counter := v_counter + 1;
        p_One(v_counter)   := rec.one;
        p_Two(v_counter)   := rec.two;
        p_Three(v_counter) := rec.three;
        p_Four(v_counter)  := rec.four;
    END LOOP;
END;
The cursor is used to populate one array for each column returned. I have tried to find information supporting his claim that this method is faster, but I have been unable to do so. Can anyone fill me in on why he might want us (the .NET developers) to write stored procedures in this way?
The DBA's request does not make sense.
What the DBA is almost certainly thinking is that he wants to minimize the number of SQL to PL/SQL engine context shifts that go on when you're fetching data from a cursor. But the solution that is being suggested is poorly targeted at this particular problem and introduces other, much more serious performance problems in most systems.
In Oracle, a SQL to PL/SQL context shift occurs when the PL/SQL VM asks the SQL VM for more data, the SQL VM responds by executing the statement further to get the data which it then packages up and hands back to the PL/SQL VM. If the PL/SQL engine is asking for rows one at a time and you're fetching a lot of rows, it is possible that these context shifts can be a significant fraction of your overall runtime. To combat that problem, Oracle introduced the concept of bulk operations back at least in the 8i days. This allowed the PL/SQL VM to request multiple rows at a time from the SQL VM. If the PL/SQL VM requests 100 rows at a time, you've eliminated 99% of the context shifts and your code potentially runs much faster.
Once bulk operations were introduced, there was a lot of code that could be refactored in order to be more efficient by explicitly using BULK COLLECT operations rather than fetching row-by-row and then using FORALL loops to process the data in those collections. By the 10.2 days, however, Oracle had integrated bulk operations into implicit FOR loops so an implicit FOR loop now automatically bulk collects in batches of 100 rather than fetching row-by-row.
In your case, however, since you're returning the data to a client application, the use of bulk operations is much less significant. Any decent client-side API is going to have functionality that lets the client specify how many rows need to be fetched from the cursor in each network round-trip and those fetch requests are going to go directly to the SQL VM, not through the PL/SQL VM, so there are no SQL to PL/SQL context shifts to worry about. Your application has to worry about fetching an appropriate number of rows in each round-trip-- enough that the application doesn't become too chatty and bottleneck on the network but not so many that you have to wait too long for the results to be returned or to store too much data in memory.
Returning PL/SQL collections rather than a REF CURSOR to a client application isn't going to reduce the number of context shifts that take place. But it is going to have a bunch of other downsides not the least of which is memory usage. A PL/SQL collection has to be stored entirely in the process global area (PGA) (assuming dedicated server connections) on the database server. This is a chunk of memory that has to be allocated from the server's RAM. That means that the server is going to have to allocate memory in which to fetch every last row that every client requests. That, in turn, is going to dramatically limit the scalability of your application and, depending on the database configuration, may steal RAM away from other parts of the Oracle database that would be very useful in improving application performance. And if you run out of PGA space, your sessions will start to get memory related errors. Even in purely PL/SQL based applications, you would never want to fetch all the data into collections, you'd always want to fetch it in smaller batches, in order to minimize the amount of PGA you're using.
In addition, fetching all the data into memory is going to make the application feel much slower. Almost any framework is going to allow you to fetch data as you need it so, for example, if you have a report that you are displaying in pages of 25 rows each, your application would only need to fetch the first 25 rows before painting the first screen. And it would never have to fetch the next 25 rows unless the user happened to request the next page of results. If you're fetching the data into arrays like your DBA proposes, however, you're going to have to fetch all the rows before your application can start displaying the first row even if the user never wants to see more than the first handful of rows. That's going to mean a lot more I/O on the database server to fetch all the rows, more PGA on the server, more RAM on the application server to buffer the result, and longer waits for the network.
I believe that Oracle will begin sending results from a system like this as it scans the database, rather than retrieving them all and then sending them back. This means that results are sent as they are found, speeding the system up. (Actually, if I remember correctly, it returns results in batches to the loop.) This is mostly from memory of some training.
HOWEVER, the real question is: why not ask him his reasoning directly? He may be referring to a trick Oracle can utilize, and if you understand the specifics you can use the speed trick to its full potential. Generally, blanket statements like "Always do this, as this is faster" are suspicious and deserve a closer look to fully understand their intentions. There may be situations where this really is not applicable (small query results, for example), where all the readability issues and overhead are not helping performance.
That said, it may be done to keep the code consistent and more quickly recognizable. Communication about his reasoning is the most important tool with concerns like this, as chances are good that he knows a trade secret that he's not fully articulating.
