The gcs.listbucket() method behaves inconsistently: sometimes the bucket appears to be empty when it is not. Calling this handler several times doesn't always return the same list: sometimes it behaves correctly, and sometimes it returns an iterator with zero items.
import webapp2
import cloudstorage as gcs

class MainHandler(webapp2.RequestHandler):
    def get(self):
        bucket_name = '/my-non-empty-bucket'
        bucket_images = gcs.listbucket(bucket_name)
        self.response.write('<br> '.join([b.filename for b in bucket_images]))
It seems that listbucket times out sometimes; I don't understand why.
My retry parameters:
my_default_retry_params = gcs.RetryParams(initial_delay=0.2,
                                          max_delay=5.0,
                                          backoff_factor=2,
                                          max_retry_period=15,
                                          urlfetch_timeout=10)
gcs.set_default_retry_params(my_default_retry_params)
The App Engine log looks perfectly fine, with no errors.
Any suggestion about how to further understand this behaviour would be appreciated.
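One way to narrow this down is to log what each call returns while passing listbucket an explicit, more generous RetryParams (listbucket accepts a per-call retry_params argument), so transient urlfetch timeouts are retried for longer and the result size shows up in the logs. A minimal diagnostic sketch; the helper name, the logging, and the looser timeout values are my own additions, not from the original question:

import logging
import cloudstorage as gcs

def list_bucket_logged(bucket_name):
    # Looser per-call retry settings than the defaults above, so
    # transient urlfetch timeouts get retried for longer before
    # the call gives up.
    params = gcs.RetryParams(initial_delay=0.5,
                             max_delay=10.0,
                             backoff_factor=2,
                             max_retry_period=60,
                             urlfetch_timeout=30)
    stats = list(gcs.listbucket(bucket_name, retry_params=params))
    logging.info('listbucket(%s) returned %d objects', bucket_name, len(stats))
    return stats

If the zero-item results disappear with the looser settings, or correlate with slow requests in the log, that points at the retry budget rather than at the bucket contents.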
Based on the documentation for Objectify and Google Cloud Datastore, I would expect the queries and the batch loads in the following code to execute in parallel:
List<Iterable<Key<MyType>>> results = new ArrayList<>();
for (...) {
    results.add(ofy().load()
            .type(MyType.class)
            .filter(...)
            .keys()
            .iterable());
}
...
Iterable<Key<MyType>> keys = ...;
Collection<MyType> c = ofy().load().keys(keys).values();
But the trace makes it look like each query and each entity load executes in sequence.
What gives?
It looks like this only happens when doing a cached get from memcache. With similar code, I see the expected async behavior for datastore_v3.Get/Put/Delete.
It seems the reason for this is that Objectify doesn't use AsyncMemcacheService. Indeed, there is an open issue for this on the project page, and this can also be confirmed by checking out the source and doing a grep -r AsyncMemcacheService.
Regarding the serial datastore_v3.RunQuery calls: calls to ofy().load().type(...).filter(...).iterable() are 'asynchronous' in that they return immediately; however, the actual Datastore queries themselves execute serially, because the App Engine Datastore API doesn't expose an explicitly asynchronous API for queries.
I'm using NoseGAE to write local unit tests for my App Engine application, but something has suddenly gone wrong with one of my tests. I have standard setUp and tearDown functions, but one test broke for a reason I can't discern. Even stranger, setUp and tearDown are NOT getting called each time. I added global variables to count setUp/tearDown calls, and by my 4th test (the now seemingly broken one), setUp has been called twice and tearDown only once. Further, one of the objects from the third test exists when I query it by id, but not in a general query for its type. Here's some code that gives the bizarre picture:
import os
import unittest
import webtest
from google.appengine.datastore import datastore_stub_util
from google.appengine.ext import testbed

class GameTest(unittest.TestCase):
    def setUp(self):
        self.testapp = webtest.TestApp(application)
        self.testbed = testbed.Testbed()
        self.testbed.activate()
        self.testbed.init_datastore_v3_stub(
            # probability=1: writes are always immediately visible to global queries
            consistency_policy=datastore_stub_util.PseudoRandomHRConsistencyPolicy(probability=1),
            require_indexes=True,
            root_path="%s/../../../" % os.path.dirname(__file__)
        )

    def tearDown(self):
        self.testbed.deactivate()
        self.testapp.cookies.clear()

    def test1(self):
        ...

    def test2(self):
        ...

    def test3(self):
        ...
        # I create a Game object with the id 123 in this particular test
        Game(id=123).put()
        ...

    def test4(self):
        print "id lookup: ", Game.get_by_id(123)
        print "query: ", Game.query().get()
        self.assertIsNone(Game.get_by_id(123))
This is an abstraction of the tests, but it illustrates the issue.
The 4th test fails because it asserts that an object with that id does not exist. When I print the two statements, I get:
id lookup: Game(key=Key('Game', 123))
query: None
The id lookup finds the object created in test3, but the query comes back EMPTY. This makes absolutely no sense to me. Further, I am 100% sure the test was working earlier. Does anyone have any idea how this is even possible? Could a corrupted local file be causing the issue?
I somewhat "solved" this. The issue only reproduced when other test cases in other files were failing. Once I fixed those, all my tests passed. I still don't fully understand why other failing tests should cause these bizarre issues with the testbed, but to anyone else hitting this: try fixing your other test cases first and see whether the problem goes away.
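One pattern worth trying (my own suggestion, not part of the original answer): register testbed deactivation with addCleanup right after activating it. Unlike tearDown, which is skipped entirely when setUp itself raises, cleanups registered before the failure still run, so a failing test in one file is less likely to leak an active testbed into the next test:

import unittest
from google.appengine.ext import testbed

class GameTest(unittest.TestCase):
    def setUp(self):
        self.testbed = testbed.Testbed()
        self.testbed.activate()
        # Runs even if the rest of setUp raises or a test fails,
        # so no stale testbed state leaks between tests.
        self.addCleanup(self.testbed.deactivate)
        self.testbed.init_datastore_v3_stub()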
My program relies on the NDB context cache so that different ndb.Key.get() calls receive the same model instance.
However, I discovered that this doesn't work properly with asynchronous gets. The expected behavior is that NDB's batcher combines the requests and returns the same model instance, but that doesn't happen.
The problem only occurs when memcache is enabled, which is also strange.
Here is a test case (run it twice):
from google.appengine.ext import ndb

class Entity(ndb.Model):
    pass

# Disabling memcache fixes the issue
# Entity._use_memcache = False

def run_test():
    entity_key = ndb.Key('Entity', 1)

    # Set up entity in datastore and memcache on first run
    if not entity_key.get():
        entity = Entity(key=entity_key)
        entity.put()
        return

    # Clear cache after Key.get() above
    ndb.get_context().clear_cache()

    # Entity is now in memcache and datastore but not in the context cache
    entity_future_a = entity_key.get_async()
    entity_future_b = entity_key.get_async()
    entity_a = entity_future_a.get_result()
    entity_b = entity_future_b.get_result()

    # FAILS
    assert entity_a is entity_b
So far I have only tested this on the local SDK.
It is possible that this is happening because you are not yielding the futures anywhere. Can you try setting up the environment so you can use

entity_a, entity_b = yield entity_future_a, entity_future_b

?
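For context: yielding a tuple of futures like that only works inside an NDB tasklet. A minimal sketch of the setup being suggested; the tasklet name and wrapper are mine, reusing the Entity model from the test case:

from google.appengine.ext import ndb

@ndb.tasklet
def get_twice(key):
    future_a = key.get_async()
    future_b = key.get_async()
    # Yielding both futures together lets NDB's autobatcher
    # resolve them in the same batch.
    entity_a, entity_b = yield future_a, future_b
    raise ndb.Return((entity_a, entity_b))

entity_a, entity_b = get_twice(ndb.Key('Entity', 1)).get_result()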
I have a simple AppEngine handler as follows:
import webapp2
from google.appengine.api import files

class TestGS(webapp2.RequestHandler):
    def get(self):
        file_name = '/gs/ds_stats/testfile'
        files.gs.create(file_name, mime_type='text/html')
        with files.open(file_name, 'a') as file_handle:
            file_handle.write("foo")
        files.finalize(file_name)
However, when I call this handler, I get ExistenceError: ApplicationError: 105 at the files.open(...) line.
This seems like a super simple scenario, and there's no indication at all as to why it is failing (especially since the files.gs.create call right above it seems to have succeeded; is there any way to verify that?).
Looking through the source code, I see that the following problems can cause this error:
if (e.application_error in
    [file_service_pb.FileServiceErrors.EXISTENCE_ERROR,
     file_service_pb.FileServiceErrors.EXISTENCE_ERROR_METADATA_NOT_FOUND,
     file_service_pb.FileServiceErrors.EXISTENCE_ERROR_METADATA_FOUND,
     file_service_pb.FileServiceErrors.EXISTENCE_ERROR_SHARDING_MISMATCH,
     file_service_pb.FileServiceErrors.EXISTENCE_ERROR_OBJECT_NOT_FOUND,
     file_service_pb.FileServiceErrors.EXISTENCE_ERROR_BUCKET_NOT_FOUND,
    ]):
  raise ExistenceError()
That's a pretty large range of possible causes, and of course it doesn't tell me which one! And again, it's strange that the create seems to work.
The problem turned out to be a lack of clarity in the documentation. files.gs.create returns a special 'writable file path' which you need to feed into open and finalize. A correct example looks like this:
class TestGS(webapp2.RequestHandler):
    def get(self):
        file_name = '/gs/ds_stats/testfile'
        writable_file_name = files.gs.create(file_name, mime_type='text/html')
        with files.open(writable_file_name, 'a') as file_handle:
            file_handle.write("foo")
        files.finalize(writable_file_name)
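To the question's aside about verifying the write: my understanding is that once the file has been finalized, it can be read back under its original /gs/... name (the special writable path is only for writing). A small sketch under that assumption, continuing the handler above:

# After files.finalize(writable_file_name), read back via the original name.
with files.open(file_name, 'r') as f:
    assert f.read(3) == "foo"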
I have implemented an App Engine Java app. It runs just fine, except that I'm seeing way too many datastore read operations.
So I installed the Appstats tool to analyze it. It showed that, within a single request, one point in my code does:
Query query = persistenceManager.newQuery(Info.class,
        ":keys.contains(key)");
List<Info> storedInfos = (List<Info>) query.execute(keys);
That single call to execute(...) results in multiple datastore_v3.Get calls. I get this stack trace multiple times:
com.google.appengine.tools.appstats.Recorder:297 makeAsyncCall()
com.google.apphosting.api.ApiProxy:184 makeAsyncCall()
com.google.appengine.api.datastore.DatastoreApiHelper:59 makeAsyncCall()
com.google.appengine.api.datastore.AsyncDatastoreServiceImpl:351 doBatchGetBySize()
com.google.appengine.api.datastore.AsyncDatastoreServiceImpl:400 doBatchGetByEntityGroups()
com.google.appengine.api.datastore.AsyncDatastoreServiceImpl:292 get()
com.google.appengine.api.datastore.DatastoreServiceImpl:87 get()
com.google.appengine.datanucleus.WrappedDatastoreService:90 get()
com.google.appengine.datanucleus.query.DatastoreQuery:374 executeBatchGetQuery()
com.google.appengine.datanucleus.query.DatastoreQuery:278 performExecute()
com.google.appengine.datanucleus.query.JDOQLQuery:164 performExecute()
org.datanucleus.store.query.Query:1791 executeQuery()
org.datanucleus.store.query.Query:1667 executeWithArray()
org.datanucleus.api.jdo.JDOQuery:243 execute()
de.goddchen.appengine.app.InfosServlet:78 doPost()
It is even calling executeBatchGetQuery, so why is this issued multiple times?
I have already tried out some datastore/persistence-manager settings, but none helped :(
Any ideas?
You're probably seeing the keys being broken into groups and the query being executed asynchronously. How many keys are you querying for?