Objectify queries and strange eventual consistency - google-app-engine

I'm seeing some strange behavior related to Objectify and eventual consistency. I noticed it while running integration tests that make HTTP requests to an App Engine Java development server.
Because I want those tests to also pass against the real App Engine environment, they handle eventual consistency by repeating any request whose result depends on an eventually consistent query.
I had accidentally put the ObjectifyFilter in the wrong location in web.xml, so the filter never ran. Now that I have moved it to the start of the filter chain, so that it actually runs, all my queries seem to always return consistent results! No more eventual consistency, that is!
For example, one test does the following:
1. A request adds a user with some username.
2. A request tries to authorize the user with that username and a password. This makes a global query for users with the given username. The query should be eventually consistent, but it always finds the user entity.
I have no clue what is happening.
More info:
I have checked that ofy().toString() returns a different value for each request.
I'm using -Ddatastore.default_high_rep_job_policy_unapplied_job_pct=50
Appengine SDK version 1.8.6
I'm making all writes inside transactions
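The retry approach described in the question can be sketched roughly as follows. This is a minimal illustration in Python, not the asker's actual test code; the helper name `eventually` and its parameters are assumptions made up for this example.

```python
import time


def eventually(check, attempts=5, delay=0.0, sleep=time.sleep):
    """Call check() until it returns a truthy value or attempts run out.

    Hypothetical helper: `check` stands in for an HTTP request whose
    result depends on an eventually consistent query.
    """
    for attempt in range(attempts):
        result = check()
        if result:
            return result
        # Only wait if another attempt is coming.
        if attempt < attempts - 1:
            sleep(delay)
    raise AssertionError("condition not met after %d attempts" % attempts)


# Simulated responses: the first two queries miss, the third finds the user.
responses = iter([None, None, {"user": "alice"}])
found = eventually(lambda: next(responses), attempts=5, delay=0)
```

A helper like this keeps the retry policy in one place, so it can be switched off when the tests run against a strongly consistent local datastore.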

Disable eventual consistency in your tests. Adding retries and sleeps does not change the logic of your code; it just complicates testing. There's no point in trying to test around eventually consistent behavior; just be aware that it exists.
I don't know the answer to your specific question because it's really about the specific behavior of the test harness. Re-read the unit testing guide closely; unapplied jobs are applied at odd points like the second time a query is run. It's only a very rough approximation of the eventually consistent behavior of the server environment.

Related

Synchronicity and the Datastore in Google App Engine

I seem to be having a consistency problem with some of my data. I'm writing a unit test to see if a certain model has been placed in the datastore. The test fails unless I put a 5-second sleep before the storing function returns.
I've been reading about asynchronous functions in GAE, thinking that perhaps I need something along the lines of a promise so that the function will wait before returning until the data has been placed into the datastore. However, all the documentation on asynchronous versions of functions in GAE seems to imply that the non-async functions already act like promises in that way.
What does it mean for a function like put() to return? It seems to not mean that the data has been appropriately stored. Is there a way to wait until the data has been stored?
EDIT: My problem wasn't simply dealing with consistency. I was unsure whether the problem was a consistency issue at all, and wanted to ask specifically how the return of a call to put() relates to what is happening under the hood of GAE.
I think this question is similar to the one listed, but it is still useful to keep up because it approaches the consistency issue from a different perspective. If other people need to find this information but aren't entirely sure of the phrasing, as I wasn't, or follow a similar train of thought, they may be able to reach the information through this question. It's also written more explicitly, with less domain-specific terminology.
That being said, I do see the issue in terms of end-goal informational content; I would understand if it's taken down.
https://cloud.google.com/appengine/docs/java/datastore/#Java_Datastore_writes_and_data_visibility
Data writes happen in two stages, commit and apply. Commit records the transactions to a majority of replicas, and apply does two things in parallel: 1) writes the data and 2) writes the indexes.
Your unit test query may be executing on a replica that has a stale version of the data. The write operation returns immediately after the commit phase, but the apply phase happens asynchronously. Ancestor queries and gets by key are guaranteed to be up to date, however, so try testing by fetching the object by its key.
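The commit/apply split described above can be illustrated with a toy in-memory model. This is only a sketch under simplifying assumptions, not the real Datastore API: the class `ToyDatastore` and its methods are invented for illustration, each entity is reduced to a single indexed value, and a get by key simply applies every pending write before reading.

```python
class ToyDatastore:
    """Toy model of two-phase writes; not the real Datastore API."""

    def __init__(self):
        self.log = []        # committed writes that are not yet applied
        self.entities = {}   # applied entity data, keyed by entity key
        self.index = {}      # applied index rows: value -> set of keys

    def put(self, key, value):
        # Commit phase: record the write, then return to the caller.
        self.log.append((key, value))

    def apply_pending(self):
        # Apply phase: write entity data and index rows for committed writes.
        for key, value in self.log:
            old = self.entities.get(key)
            if old is not None:
                self.index.get(old, set()).discard(key)
            self.entities[key] = value
            self.index.setdefault(value, set()).add(key)
        self.log.clear()

    def get(self, key):
        # A get by key applies outstanding writes first, so it is up to date.
        self.apply_pending()
        return self.entities.get(key)

    def global_query(self, value):
        # A global query reads the index as it stands, which may be stale.
        return sorted(self.index.get(value, set()))


store = ToyDatastore()
store.put("user:1", "alice")
stale = store.global_query("alice")   # index rows not applied yet
fetched = store.get("user:1")         # forces the apply, then reads
fresh = store.global_query("alice")   # index now reflects the write
```

In this model `put()` returning only means the write is committed; whether a query sees it depends on whether the apply step has run.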

Controlling eventual AppEngine datastore consistency during testing

I have an AppEngine app written in Go, and I'm trying to improve my tests.
Part of the tests that I need to run are a series of create, update, delete queries on the same object. However given that the datastore is eventually consistent (these aren't child objects), I am currently stuck using a time.Sleep(time.Second * 5) to give the simulated datastore in the SDK enough time for consistency to propagate.
This results in tests that take a long time to run. How can I force something more like strong consistency for tests without rewriting my code to use ancestor queries?
Have a look at the dev_appserver.py arguments. You will see there is an option for setting the consistency policy:
  --datastore_consistency_policy {consistent,random,time}
                        the policy to apply when deciding whether a datastore
                        write should appear in global queries (default: time)
Notice the default is time; you want consistent.
It's been a while, but the method I found that works well is to create the test context as follows:
c, err := aetest.NewContext(&aetest.Options{StronglyConsistentDatastore: true})

Writing to Datastore from Backends without shutting down

I am trying to write a program in Google App Engine (Python) that continually runs a resident Backend which works on finding what a series converges to. I want it to run in the Backend and write to the Datastore, so that at any point in time you can tell which term the series is on and what its value is. The Backend only writes to one entity in the Datastore, so it does not overload the storage or anything. The problem I run into, though, is that the entity the Backend writes is not accessible by my frontend webpage until the Backend is shut down, which defeats the purpose of being able to continually check in on it. If there is some way to have the Backend write to the Datastore so the frontend page can check in on it, please tell me!
Datastore writes in a backend process should behave no differently than writes in your front end app, meaning that they should be available for read in your front end (nearly) instantly (within consistency constraints). Both backend and front end interact with the same datastore.
It sounds like you just need to implement a recurring write of the current status of your series (ie. once every x cycles), instead of writing once at the end of the backend process.
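A recurring status write of that kind might look like the following sketch. The function `run_series` and the `save_status` callback are hypothetical names chosen for this example; in a real backend, `save_status` would wrap whatever call stores the status entity.

```python
def run_series(terms, save_status, every=100):
    """Sum `terms`, calling save_status(index, total) every `every` terms.

    `save_status` is a hypothetical callback standing in for the
    datastore write of the single status entity.
    """
    total = 0.0
    index = 0
    for index, term in enumerate(terms, start=1):
        total += term
        if index % every == 0:
            save_status(index, total)
    # Final write so the last partial batch is not lost.
    if index % every != 0:
        save_status(index, total)
    return total


# Usage: record progress of the series 1/2 + 1/4 + ... every 4 terms.
saved = []
result = run_series(
    (1.0 / 2 ** n for n in range(1, 11)),
    lambda i, v: saved.append((i, v)),
    every=4,
)
```

Choosing `every` is a trade-off between how fresh the frontend's view is and how many writes the backend issues.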
Your post suggests two issues.
The first is "without shutting down". We don't guarantee that backends will run indefinitely. See the docs on Shutdown for some details.
The second issue, if I'm understanding you, is that you're not seeing values written by the backend until some time after they're written. You may be running into "eventual consistency", where "eventual" is usually pretty short, but can on rare occasions be surprisingly long. Understanding Isolation and Consistency can help here.

writing then reading entity does not fetch entity from datastore

I am having the following problem. I am now using the low-level
google datastore API rather than JDO, that way I should be in a
better position to see exactly what is happening in my code. I am
writing an entity to the datastore and shortly thereafter reading it
from the datastore using Jetty and eclipse. Sometimes the written
entity is not being read. This would be a real problem if it were to
happen in production code. I am using the 2.0 RC2 API.
I have tried this several times, sometimes the entity is retrieved
from the datastore and sometimes it is not. I am doing a simple
query on the datastore just after committing a write transaction.
(If I run the code through the debugger things run slow enough
that the entity has a chance of being read back on the second pass).
Any help with this issue would be greatly appreciated,
Regards,
The development server has the same consistency guarantees as the High Replication datastore on the live server. A "global" query uses an index that is only guaranteed to be eventually consistent with writes. To perform a query with strongly consistent guarantees, the query must be limited to an entity group, using an "ancestor" key.
A typical technique is to group data specific to a single user in a group, so the user can see changes to queries limited to the user's group with strong consistency guarantees. Another technique is to use fancier client logic to update the client's local view as soon as the change is submitted, so the user sees the change in the UI immediately while the update to the global index is in progress.
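The second technique, updating the client's local view right away, can be sketched as a merge of possibly stale query results with the writes the client has just submitted. The function `merged_view` and the dictionary shape with a `"key"` field are assumptions for illustration only.

```python
def merged_view(query_results, pending_writes):
    """Overlay locally submitted writes on possibly stale query results.

    Items are plain dicts with a "key" field (an assumed shape).
    A pending write replaces any query result with the same key.
    """
    by_key = {item["key"]: item for item in query_results}
    for item in pending_writes:
        by_key[item["key"]] = item
    return list(by_key.values())


# The global query still returns the old text and misses the new entity.
stale_results = [{"key": "a", "text": "old"}]
submitted = [{"key": "a", "text": "new"}, {"key": "b", "text": "draft"}]
view = merged_view(stale_results, submitted)
```

Once a later query returns the submitted items, the client can drop them from its pending list.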
See the docs on queries and transactions.

How is the Google App Engine infrastructure fault tolerant?

I am currently implementing a web application on Google App Engine. So far I have spent a lot of time redesigning the database and the application around GAE requirements and best practices.
My problem is this: how can I be sure that GAE is fault tolerant, or to what degree is it fault tolerant? I didn't find any GAE documents on this, and it is an issue that could have drawbacks for me. My app would have, for example, to read an entity from the datastore, do some computation on it in the application, and then put it back in the datastore. In this case, how can we be sure that this is done correctly and that we get the right data if, for example, the machine on which the computation is being done crashes?
Thank you for your help!
If a server crashes during a request, that request is going to fail, but any new requests would be routed to a different server. So one user might see an error, but the rest would not. The data in the datastore would be fine. If you have data that needs to be kept consistent, you would do your updates in a transaction, so that either the whole set of updates was applied or none.
Transactions operating on the same entity group are executed serially, but transactions operating on different entity groups run in parallel. So, unless there is a single entity which everything in your app wants to read and write, scalability will not suffer from transactions.
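The two properties described above (all-or-nothing updates, and serial execution within an entity group) can be mimicked with a toy model. This is a sketch under stated assumptions and not how the Datastore is implemented: the class `ToyGroups` is invented for illustration and uses one lock per group to model the serial execution.

```python
import threading
from collections import defaultdict


class ToyGroups:
    """Toy model: one lock per entity group, all-or-nothing updates."""

    def __init__(self):
        self._locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()
        self.data = {}

    def _lock_for(self, group):
        # Guard the lock table itself so lock creation is thread-safe.
        with self._guard:
            return self._locks[group]

    def transact(self, group, fn):
        # Transactions on the same group run one at a time;
        # transactions on different groups do not block each other.
        with self._lock_for(group):
            snapshot = dict(self.data.get(group, {}))
            fn(snapshot)                  # if this raises, nothing is written
            self.data[group] = snapshot   # publish all changes at once


def bump(entities):
    entities["counter"] = entities.get("counter", 0) + 1


def failing(entities):
    entities["counter"] = -1
    raise ValueError("simulated crash mid-update")


store = ToyGroups()


def worker():
    for _ in range(100):
        store.transact("group-a", bump)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

try:
    store.transact("group-a", failing)
except ValueError:
    pass  # the failed transaction leaves the stored data untouched
```

After this runs, the counter reflects every successful increment and none of the failed update, which is the guarantee the read-compute-write scenario in the question relies on.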

Resources