502 errors for all endpoint requests without changing anything - google-app-engine

We just started having lots of 502 errors out of the blue, without deploying anything new. Somehow 99% of all requests to the endpoints don't get through to appengine (as seen in the appengine log). The service status of google app engine and endpoints seems to be green.
We tried deploying a new Endpoints API description and a new App Engine version that uses it, and also stopping the respective versions, without effect.
We can also no longer look at the api explorer.
Web requests via the gapi JS library return "Error 502 (Server Error)!!1" when trying to initialize and load the "_ah/api/static/proxy.html" page.
What could be the problem here? Is there a way to "restart" endpoints?

OK, it just magically started working again after around 50 minutes of downtime. I guess it would still be interesting to know whether there is anything we could do in cases like this.
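One thing a client can do in cases like this is retry gateway errors with exponential backoff. The sketch below is only an illustration: the function names, the set of retryable status codes, and the delay values are assumptions, not anything Endpoints prescribes. It rides out short blips and would not have helped much with a 50-minute outage.

```python
import random
import time
import urllib.error
import urllib.request

# Assumption: these gateway-style statuses are worth retrying.
RETRYABLE_STATUSES = {502, 503, 504}


def backoff_delays(max_attempts, base_delay=1.0, max_delay=60.0):
    """Return the un-jittered wait (in seconds) before each retry."""
    return [min(max_delay, base_delay * (2 ** i)) for i in range(max_attempts - 1)]


def fetch_with_retry(url, max_attempts=5, base_delay=1.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """GET `url`, retrying retryable HTTP errors with exponential backoff."""
    delays = backoff_delays(max_attempts, base_delay)
    for attempt in range(max_attempts):
        try:
            with opener(url, timeout=10) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Give up on non-retryable statuses or once attempts are exhausted.
            if err.code not in RETRYABLE_STATUSES or attempt == max_attempts - 1:
                raise
        # Add a little jitter so many clients do not retry in lockstep.
        sleep(delays[attempt] + random.uniform(0, 0.5))
```

The `opener` and `sleep` parameters exist only so the function can be exercised without a network or real waiting.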

Related

Getting inconsistent 500 error - inconsistent Error code 204 on Google App Engine PHP Standard

Recently one of our sites was suspended by Google Ads for "Destination Not Working". When I talked with Google Support, they told me that my site is not accessible from all locations globally. I then investigated. The site is hosted on Google App Engine, and I didn't find any 500 errors. But some website-checking tools such as "Uptrends" sometimes showed an inconsistent "Http Protocol Error"/500 error. I then looked closely at Google Stackdriver Logging and ran several tests on Uptrends and other tools, and saw something like this.
And on App Engine logging, I saw something like -
Also, some HTTP requests sometimes don't reach my app at all, so my app logging records nothing for them, and this is bothering us a great deal. We are losing a large part of our marketing budget because of this. It would be great if anybody could give me a clue on how to test and investigate this.
204s mostly happen because of RAM issues, so moving to a bigger instance type usually clears them up.
https://issuetracker.google.com/issues/35900014
I've gotten a 204 before, and it was because of a memory leak in App Engine's SSL library. I was passing it the path to a cert file as a string, and it wasn't closing those files. The workaround was to open and close the file myself and pass it the file handle instead.
If you pay for Google Cloud Support, they may be able to help dig into things that are not visible to you.

Secured endpoints and redeploy issue

I played around with Endpoints and noticed that after deploying a new App Engine version and making it the default, all existing logged-in clients started to get unauthorized exceptions.
What am I missing? Is this the way it is supposed to work?

Google App Engine request being executed multiple times

I have an issue with our Python 2.7 Google App Engine project where after a lot of research it appears as though App Engine is repeatedly sending through the same request to our servers. Is anyone else experiencing this issue?
EDIT:
It's a standard request with all the same parameters it was first sent with (some of which are 15 days old now). I'm wondering if it's a similar situation to this https://groups.google.com/forum/#!topic/google-appengine-downtime-notify/9fAYP7UyppQ
Typically App Engine doesn't send requests, it's the framework you use to handle requests.
If you're getting duplicate requests coming in, you should look to see where it's coming from. If it's coming from an app engine server to your own external server, then somebody is running some sort of process on their app engine account that's accessing data from your server during its execution.
The details, including what you would do about it, are all situation-dependent. There's no catch-all response. Just be aware that, yes, any computer on the Internet can make web requests to any other. And that includes app engine (in both directions).

Is there a way to prevent Google App Engine from restarting instances due to (faux) http errors?

My Google App Engine application returns 503 codes to clients when they should try later to receive data from the server. This appears to cause app engine to think the application is failing and after several such responses the instance is restarted, adding to average latency. Is there a way to prevent app engine from restarting instances just because you manually return 503 or other non-200 http response codes? TIA!
Edit 1:
Here's a screenshot of how it typically goes (with some app-specific stuff ablated due to the sensitive nature of my app). Note that all the [I]nformational and [D]ebug messages are generated by my code, while the [W]arning about the restart is obviously GAE itself. The only thing that distinguishes the times this happens is when I return a 503.
App Engine doesn't base decisions to terminate or restart instances on the status codes you return, or on what you log.
It seems App Engine now does base instancing decisions on your return code. To the best of my knowledge, there's no way to return a 5xx code, but tell App Engine that nothing's really wrong.
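If 5xx responses really do count against the instance, one possible workaround is to signal "try later" with a non-5xx status plus a `Retry-After` header. The sketch below uses a plain WSGI callable and 429. It rests on an assumption that is not confirmed anywhere in this thread: that App Engine treats 4xx responses differently from 5xx ones. `data_ready` and `RETRY_AFTER_SECONDS` are hypothetical names.

```python
RETRY_AFTER_SECONDS = 30  # hypothetical value; tune for your clients


def data_ready():
    """Hypothetical placeholder for the app's own readiness check."""
    return False


def app(environ, start_response):
    """WSGI app that asks clients to retry later without using a 5xx code."""
    if not data_ready():
        # Assumption: a 4xx status is not counted as an instance failure.
        start_response("429 Too Many Requests", [
            ("Content-Type", "text/plain"),
            ("Retry-After", str(RETRY_AFTER_SECONDS)),
        ])
        return [b"Try again later\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"data\n"]
```

Clients would need to treat 429 the way they currently treat 503, so this only works if you control them.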

Where are the GAE Backends logs?

As soon as I figured out how to invoke my app as a Backends app, my log messages stopped appearing.
I found this helpful post which says the Backends logs are kept separately and that I need to switch to Backends view by selecting the named backend from the dropdown at the top of the admin console, but I don't see anything Backends specific in that list to select.
Perhaps this is a clue... when I invoke my app via myapp.appspot.com/dostuff, in the log I see I'm getting the DeadlineExceededError after 60 secs, indicating it's not running as a Backends app. But when I invoke it via mybackend.myapp.appspot.com/dostuff, it continues running as needed, but no log entries!
Seems like I'm missing something. Thanks.
Somehow my app wasn't fully recognized by GAE as a Backend app. And therefore my backend instances weren't available in the drop-list. Seeing that others had similar trouble with a Python 2.7 app (ex.) that had been initially uploaded as a 2.5 app, I created a new app in GAE and uploaded to there and it worked. Sorry I don't have a more definitive answer.
Before all was working well, I also ended up creating an empty /_ah/start handler as suggested by someone in this thread. Also if you're deploying a multi-threaded backend, make sure to check this post out -- there's some important stuff I didn't run across in the docs.
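For reference, an "empty" `/_ah/start` handler can be as small as the following. This is a minimal sketch written as a plain WSGI callable rather than in whatever framework the original app used, and `backend_app` is a hypothetical name.

```python
def backend_app(environ, start_response):
    """Minimal WSGI app with an empty /_ah/start handler."""
    path = environ.get("PATH_INFO", "")
    if path == "/_ah/start":
        # Nothing to initialize: just acknowledge the start request.
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found\n"]
```

Any real request routing would replace the 404 branch.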
