AppEngine datastore entities stats not available. But it works locally. - google-app-engine

I have an app on appengine. It works fine locally. This morning, I deployed it and it doesn't work properly.
This is the app link, you can user username: qingWANG & password: wang123456 to log in as a teacher. Please forgive its ugly look, because it's just a prototype. When you log in, click reviewAssignment and in the new page you can see the homework students' uploads. You can try to click the score button, and you will see a 500 server error. The weird thing is, the first score button works fine, although it doesn't at the beginning; the others totally do not work.
When I check the datastore admin, it shows this:
I tested all these pages locally many times. But why does this error happen when I deploy?
Best Regards.

It's working for me. I am not getting any 500 errors...

Go to your dashboard and check the logs to see what is going on...

According to the stats page (https://cloud.google.com/appengine/docs/standard/java/datastore/stats)
"To keep the overhead of storing and updating the statistics reasonable, Cloud Datastore progressively drops statistics entities"
So when having lots of namespaces or entities, datastore statistics become unavailable.
Thus at some point sharded counters (https://cloud.google.com/appengine/articles/sharding_counters) must be used.

I deployed my app to Server, then I got this situation too.
No matter what I do, I can save an Entity and get it by Key. But I always failed by query it.
Two hours later, magical thing happened. I came back to the console and fresh the data... they are here. Every data I saved into database was there.
... Obviously, you don't need to do anything to fix it. Just let it run a while longer. It will be fixed by itself.

Related

React development server is not functioning properly for 10% of users

I hope it's the right place to ask such question. I have a live react app which still runs in development mode since it requires frequent updates, and it seems that for a few users the website doesn't function properly while for the others it works perfectly.
For example, for a few users there is a button that just doesn't respond, although I created some code to log to the console that the button was pressed, and it does also for those users, but then the rest of the code just doesn't run.
For other users, there is some text that loaded from mongodb database, and for those users it reverses some of the words in a random fashion.
I'm really lost here and don't even know where to begin, since for me and most of the users everything is working fine.
Do you think it could be because the app still runs in development mode?
Recently I noticed that a few users received the message
the development server has disconnected. refresh the page if necessary.
I researched this and was suggested that I should upgrade my react-scripts. I did, and the problem vanished, but the other problems still persist.
I understand that it's like shooting in the dark and it's hard to give advices without knowing more information, but maybe you could point me in the right direction?
Thank you very much in advance

Backends logs depth

I have a long-running process in a backend and I have seen that the log only stores the last 1000 logging calls per request.
While this might not be an issue for a frontend handler, I find it very inconvenient for a backend, where a process might be running indefinitely.
I have tried flushing logs to see if it creates a new logging entry, but it didn't. This seems so wrong, that I'm sure there must be a simple solution for this. Please, help!
Thanks stackoverflowians!
Update: Someone already asked about this in the appengine google group, but there was no answer....
Edit: The 'depth' I am concerned with is not the total number of RequestLogs, which is fine, but the number of AppLogs in a RequestLog (which is limited to 1000).
Edit 2: I did the following test to try David Pope's suggestions:
def test_backends(self):
launched = self.request.get('launched')
if launched:
#Do the job, we are running in the backend
logging.info('There we go!')
from google.appengine.api.logservice import logservice
for i in range(1500):
if i == 500:
logservice.flush()
logging.info('flushhhhh')
logging.info('Call number %s'%i)
else:
#Launch the task in the backend
from google.appengine.api import taskqueue
tq_params = {'url': self.uri_for('backend.test_backends'),
'params': {'launched': True},
}
if not DEBUG:
tq_params['target'] = 'crawler'
taskqueue.add(**tq_params)
Basically, this creates a backend task that logs 1500 lines, flushing at number 500. I would expect to see two RequestLogs, the first one with 500 lines in it and the second one with 1000 lines.
The results are the following:
I didn't get the result that the documentation suggests, manually flushing the logs doesn't create a new log entry, I still have one single RequestLog with 1000 lines in it. I already saw this part of the docs some time ago, but I got this same result, so I thought I wasn't understanding what the docs were saying. Anyways, at the time, I left a logservice.flush() call in my backend code, and the problem wasn't solved.
I downloaded the logs with appcfg.py, and guess what?... all the AppLogs are there! I usually browse the logs in the web UI, I'm not sure if I could get a confortable workflow to view the logs this way... The ideal solution for me would be the one that is described in the docs.
My apps autoflush settings are set to the default, I played with them when at some time, but I saw that the problem persisted, so I left them unset.
I'm using python ;)
The Google docs suggest that flushing should do exactly what you want. If your flushing is working correctly, you will see "partial" request logs tagged with "flush" and the start time of the originating request.
A couple of things to check:
Can you post your code that flushes the logs? It might not be working.
Are you using the GAE web console to view the logs? It's possible that the limit is just a web UI limit, and that if you actually fetch the logs via the API then all the data will be there. (This should only be an issue if flushing isn't working correctly.)
Check your application's autoflush settings.
I assume there are corresponding links for Java, if that's what you're using; you didn't say.
All I can think that might help is to use a timed/cron script like the following to run every hour or so from you workstation/server
appcfg.py --oauth2 request_logs appname/ output.log --append
This should give you a complete log - I haven't tested it myself
I did some more reading and it seems CRON is already part of appcfg
https://developers.google.com/appengine/docs/python/tools/uploadinganapp#oauth
appcfg.py [options] cron_info <app-directory>
Displays information about the scheduled task (cron) configuration, including the
expected times of the next few executions. By default, displays the times of the
next 5 runs. You can modify the number of future run times displayed
with the -- num_runs=... option.
Based on your comment, I would try.
1) Write you own logger class
2) Use more than one version

App Engine backup never finishes only clue is failure in map reduce worker_callback

Over the last few weeks we have repeatedly failed on doing a complete backup of the data store using the datastore admin tool. We thought the issues had to do with quota errors we were running into so we switched our application from a free to a paid app and we still have problems.
Each time we are attempting to back up to the blobstore and what occurs is that the process never finishes. We see the backup in our Pending Backups list but it never actually completes. We only have a total of 43MB of data right now so we don't see it as a data transfer problem. Looking at our default Task Queues it shows that we have two pending tasks one is a call to /_ah/mapreduce/controller_callback and another is a call to /_ah/mapreduce/worker_callback
The worker_callback racks up its retry count and the only error clue we have is on the Previous Run tab it shows the last http response code to be 500. There is no error message, nothing shows up in our error logs, it just keeps trying over and over again.
We've been able to narrow the backup problems to a specific entity kind for a particular namespace but we can't figure out why that entity kind is failing whereas the others are not. The major difference is the entity kind has a large number of embedded entities, but if the app engine is able to read / put those entities we can't understand why it seems to be having problems backing it up. The particular namespace that the error occurs in has the largest data stored for that entity kind compared to the other namespaces we have setup.
We think if we can see what error is occurring in the worker_callback we may be able to figure out why the backup is failing, or what is wrong with our data that's preventing the backup. Is there something we need to setup / enable through settings / configuration files to give us more detailed information on the backup? Or is there some other avenue we should explore to figure out how to investigate/fix this problem?
I should mention we are using the Java SDK as well as Objectify V3 to work with the data store. We are also backing up data to the Blobstore.
Thank you.
Well with the app engine team's help we figured what the problem was and we worked around the issue. I want to give details in case anyone else runs into this problem.
From issue 8363 the app engine team indicated that from their logs they could see that the map reduce failed because of the large number of properties that our entity kind had. The specific entity kind that was causing the failure had a large number of variable properties that was generating errors when map reduce tried to write out a schema. They indicated that the solution on their end was to ignore entities that were like this in the backup to make it so the backup worked successfully.
What we did to work around the issue and make the backup work was change how we told objectify to store out data. The large number of properties were being created due to our use of the #embedded keyword on a HashMap() class member field. Since the embedded keyword breaks down classes into individual components it was generating a large number of properties. We switched the member field to be #serialized and then ran a conversion process to make it use the new serialized property. This made the backup / restore work again.
You can read more about the differences between embedded and serialized on objectify's website
snielson, would you mind opening an issue on our Public issue tracker here. Remember to add your Application ID so we can further debug this specific scenario.
Thanks!

Consecutive XML HTTP Requests seem to block on Google App Engine

I am working on an application on Google App Engine. Roughly this is what I do:
The user screen is split into 2 parts (actually 3, but lets leave that out for now). The left part (this takes upto 75% of the screen) has a document with some words highlighted. When one of these highlighted words are clicked the right part displays various meanings of it, example usage etc. The way this works is clicking the word send an XML HTTP Request to the server, where the sample usage(s)/meaning(s) are retrieved from the datastore. This data is returned and displayed.
My problem:
After I click on a few words consecutively, the application seems to "hang" - say, I click on 5 words in quick succession, clicking on the 6th word (or any word after that) doesn't replace the info regarding the 5th word on my right panel.
Since some data store columns (at least single valued properties) are indexed by default I'm guessing retrieval is not the bottleneck here. It is probably the requests.
Is such an issue known with the GAE? Any workarounds possible?
Kind of in a soup with this - the application was supposed to go live today. Urgent help required!
Thanks! :)
You're probably being limited to two simultaneous requests by your browser - not by appengine. If you click on a third link before the first two have had a chance to return, make sure your app can deal with requests returning for links that should no longer be displayed.
If you were hitting a limit on appengine, you'd see exceptions in your server logs. If you're not seeing those exceptions, it's probably a client-side issue.
Sorry for the late ack (for some reason I received a notification for the responses a day late, by which we had managed to fix a few things). It does look like the problem was at the data end - our code was doing some inserts, and it turns out you can't do too many of them quickly - the logs reported a transaction time-out error. The reason we couldn't spot it earlier in the logs was we were writing simply too much info out and this was buried in somewhere.
The clicks on the user-side were pulling data from this table.
Unfortunately, the GAE simulator doesn't simulate any timeout error - so even though we had tested with comparable volumes of data before deployment this error never happened during development.
Thanks again for your responses!
And yet again, I apologize for responding late.

Best Practice for seeing live data on the dev server?

Assumption: live/production web app suppresses errors being shown to end-users.
Suppose your tech support team wants to see live data but through the eyes of the development-side of the application (maybe you want to see what errors are occurring, or want to see when you've got an issue fixed using an end-user's data).
Right now we've got one database serving both the dev and live boxes (not my idea - I know it's gross).
Ideas?
Edit: Best/handy tools for implementing your suggestion?
We replicate the data back to a different database. Yes, there is a delay, but it keeps people hands out of the production servers. This also allows us to "hide" information that tech support (and other people for that matter) aren't supposed to see.
In addition to replicating data down, on production, we see who's logged into the application, and if it's a member of the company, send them to the real error page versus the happy kitten playing with a ball of yarn apologizing.
Back up and restore from live to dev on a regular basis (once, twice a day). It doesn't need to be realtime (as you might be entering data from the dev side anyway, which could cause problems).
If you have PCI or HIPAA data, make sure you don't put that in your dev environment -- that might break laws.
I generally like to have a 3-tier system for web development:
Development
Testing
Live
Most of the time testing is an exact copy of the live system, except that errors are turned on, when a new version is about to be moved live it's replaced with the new version BEFORE live is, to detect upgrade issues.
Development is completely separate from live, to allow for major changes to things like the database, or changes to the production environment.
I would firstly make errors are either emailed to someone with details of how the user got there or at minimum logged so you can watch the error log while you perform similar actions to see if you get the same messages in the log.
And yes, copying the database on the dev server/site is probably your only option. You don't want any changes made by the development team to live data and you'll probably also have changes that won't work with the production database at some point.
I wouldn't recommend doing a nightly copy as a developer might be in the middle of some new feature where they have added data and then it's erased that night. I usually copy the production database(s) to dev each time a major version is released. This also allows me to do speed testing with a lot of live data. On some systems I also change everyones password to a default so I can login easily as any user.
If your configuration permits it:
a. Add a logging function (if there isn't one already) to write messages of interest to a log file.
b. Run the unix command
tail -f < logfile.txt
which will stream the growing log file to your console.
http://www.monkey.org/cgi-bin/man2html?tail
If you have Windows, you might try this:
http://tailforwin32.sourceforge.net/

Resources