My frontend client, written in Angular 8, occasionally receives a 504 Gateway Timeout error for my API requests. My API is hosted in the App Engine standard environment, with a minimum of 5 instances always active.
The backend API is written in Node.js + Express.
Since it's a 504 gateway timeout, none of the errors are captured by the backend logging.
How can I debug this to find out what's causing my errors?
This is usually caused by a long-running process in App Engine that doesn't finish before the configured timeout. The best way I can think of to debug this is to add more logging to your processes and see whether something takes longer than usual when you get the 504.
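As a rough sketch of that kind of logging, an Express-style middleware can record when each request starts and how long it takes to finish. The name `requestTimer`, the `thresholdMs` option, and the log format are all illustrative assumptions, not anything from the original app. A request that logs a start line but no matching finish line before the timeout is a likely candidate for the 504.

```javascript
// Express-style middleware that logs the start and duration of each request.
// requestTimer, thresholdMs, and the log format are illustrative assumptions.
function requestTimer({ thresholdMs = 10000, log = console.log } = {}) {
  return function (req, res, next) {
    const start = process.hrtime.bigint();
    const url = req.originalUrl || req.url;
    log(`start ${req.method} ${url}`);

    // 'finish' is emitted once the response has been fully handed off.
    res.on('finish', () => {
      const ms = Number(process.hrtime.bigint() - start) / 1e6;
      const level = ms > thresholdMs ? 'SLOW' : 'done';
      log(`${level} ${req.method} ${url} ${res.statusCode} ${ms.toFixed(1)}ms`);
    });

    next();
  };
}

// Possible usage in an Express app (register it before the routes):
//   app.use(requestTimer({ thresholdMs: 5000 }));
```

Lines marked SLOW, or start lines with no finish line, point at the handlers worth instrumenting further.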
Ideally, you want to offload longer-running processes to another product like Cloud Tasks, which has a longer timeout period, so that App Engine can send a response quickly and have the processing done in the background.
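A minimal sketch of that pattern, assuming an Express-style handler: the request handler only queues the work and answers with 202 Accepted. The `enqueue` function is hypothetical; it stands in for whatever hands the job to a background queue such as Cloud Tasks and resolves to a job identifier.

```javascript
// Returns an Express-style handler that queues work instead of doing it inline.
// enqueue is a hypothetical async function (for example, one that creates a
// Cloud Tasks task) that resolves to an identifier for the queued job.
function makeAcceptHandler(enqueue) {
  return async function (req, res) {
    try {
      const jobId = await enqueue(req.body);
      // 202 Accepted: the work has been queued, not completed.
      res.status(202).json({ jobId });
    } catch (err) {
      res.status(500).json({ error: 'could not queue job' });
    }
  };
}

// Possible usage in an Express app:
//   app.post('/reports', makeAcceptHandler(enqueueReportJob));
```

The client gets its response in the time it takes to queue the job, so the request itself should stay well under the gateway timeout.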
Related
I have a React application that I have been trying to run on GKE for weeks now, but I cannot figure out the GKE Ingress. There are a total of 7 microservices running, including the React App.
My React App makes 4 API calls in total
"/posts/create" //creates a new post
'/posts/comments/*' //adds a comment to a post
'/posts' // gets posts+comments, returns empty object since no posts are created
'/posts/save' // saves post to cloudSQL
The application uses an event bus that handles communication between the different microservices, so I created a ClusterIP service for each app and created additional NodePort services to use on the Ingress. After the Ingress is created I can access the React App, but it says all of the backend services are unhealthy and I can't access them. I have tried calling the APIs in several ways through the React client, including the following (call // error in Chrome console):
"http://query-np-srv:4002/posts" //Failed to load resource: net::ERR_NAME_NOT_RESOLVED
"http://10.96.11.196:4002/posts" (this is the endpoint for the service) //xhr.js:210 GET http://10.96.11.196:4002/posts net::ERR_CONNECTION_TIMED_OUT
"http://posts.com/posts" // GET http://posts.com/posts 502 (Bad Gateway)
If I run any of the following commands from the client pod, I get an object returned as intended:
curl query-srv:4002/posts
curl 10.96.12.242:4002/posts
curl query-np-srv:4002/posts
The only way I have been able to get this application to actually work on GKE is by exposing the client, posts, comments, and query pods on LoadBalancers and hard-coding the LB IPs into the API calls, which cannot be a best practice. At least this way I know the project is functional, which leads me to believe this is an Ingress issue.
Here is my GitHub repo for the project.
All of the yaml files are located in the infra/k8s folder and I am using the test.yaml to deploy the ingress, not the ingress-srv.yaml. Also, I am not using skaffold to deploy so that can be ignored as it is not causing the issues. If anyone can figure this out I would be very appreciative.
If the backend services are unhealthy after you create the Ingress object, you need to review your health checks. Did you check whether GKE created a health check for each backend service?
Health checks connect to backends on a configurable, periodic basis.
Each connection attempt is called a probe. Google Cloud records the
success or failure of each probe. Google Cloud considers backends to
be unhealthy when the unhealthy threshold has been met. Unhealthy
backends are not eligible to receive new connections; however,
existing connections are not immediately terminated. Instead, the
connection remains open until a timeout occurs or until traffic is
dropped.
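One thing worth checking is whether each backend actually answers the probe path with HTTP 200; a service that only defines routes such as /posts may return 404 on the probed path and be marked unhealthy. The sketch below is an Express-style middleware that answers a health check path directly. The name `healthCheck` and the default path `/` are assumptions; use whatever path the health check for that backend is configured to probe.

```javascript
// Express-style middleware that answers the health check path with 200.
// healthCheck and the default healthPath are illustrative assumptions: match
// healthPath to the path the backend's health check actually probes.
function healthCheck(healthPath = '/') {
  return function (req, res, next) {
    if (req.method === 'GET' && req.path === healthPath) {
      res.statusCode = 200;
      res.end('ok');
      return;
    }
    next();
  };
}

// Possible usage in each backend service, before the other routes:
//   app.use(healthCheck('/'));
```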
I'm running a Django application on Google App Engine, and I noticed logs for requests to /_ah/api/discovery/v1/apis that result in a 502.
Does anyone know what this means and where it comes from? I can not find anything about fuzzing in the GAE docs.
Looking into the error message, I found that Cloud Endpoints v1 is no longer supported and was shut down on August 2, 2018.
As recommended, here is the link to Migrating to Cloud Endpoints Frameworks version 2.0.
In the case of the 502 error code, App Engine may take a few minutes to reply to queries satisfactorily. If you send a request and receive an HTTP 502 response, wait a minute and try again.
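The wait-and-retry advice can be sketched as a small helper. The name `retryOn502` and its options are illustrative; `doRequest` is a hypothetical async function that resolves to an object with a numeric `status`, such as the response from `fetch`.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls doRequest and retries while the response status is 502.
// doRequest is a hypothetical async function resolving to an object with a
// numeric `status`; retries and delayMs are illustrative defaults.
async function retryOn502(doRequest, { retries = 3, delayMs = 60000, wait = sleep } = {}) {
  let response = await doRequest();
  for (let attempt = 0; attempt < retries && response.status === 502; attempt++) {
    await wait(delayMs); // give App Engine time to recover before retrying
    response = await doRequest();
  }
  return response; // may still be a 502 if every retry failed
}

// Possible usage where a global fetch is available:
//   const response = await retryOn502(() => fetch(url));
```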
An error code 502 with BAD GATEWAY in the message can indicate that App Engine terminated the application because it ran out of memory. Only 600MB of memory is available for the application container in the default App Engine flexible VM.
If the error message was regarding a bad gateway, here is how to troubleshoot the 502 error code.
I have a Node.js script that starts a stream with a third party and stores the incoming messages in Firestore.
There is no need for incoming requests. But after I deployed my script to App Engine, the script only starts if I call the cloud endpoint. After that, it keeps running (and that is what it should do).
There is probably a way to start processes by default and also build in something like an auto-restart if it crashes, but I couldn't find it, or I am using the wrong search terms :-)
App Engine is a web microservice platform. By that I mean every deployed (micro)service has to be triggered by an HTTP request.
So you can't run an infinite batch process that streams data.
However, you can set up a Cloud Tasks task that calls an App Engine endpoint. The maximum duration is 24 hours. Link this to Cloud Scheduler to launch your 24-hour task every day. (In detail, Cloud Scheduler has to trigger an endpoint, such as a Cloud Function or an App Engine service. This endpoint creates the task in Cloud Tasks. Cloud Scheduler can't create a task in Cloud Tasks directly.)
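As a rough sketch of the task-creating endpoint, the function below builds a task that targets an App Engine handler. The helper name `buildAppEngineTask` and the `/stream` path are hypothetical. The commented lines show how the task could be submitted with the `@google-cloud/tasks` client, assuming that package is installed and that `projectId`, `location`, and `queueName` are defined elsewhere.

```javascript
// Builds a Cloud Tasks task that POSTs a JSON payload to an App Engine handler.
// buildAppEngineTask is an illustrative helper; relativeUri and payload are
// supplied by the caller.
function buildAppEngineTask(relativeUri, payload) {
  return {
    appEngineHttpRequest: {
      httpMethod: 'POST',
      relativeUri,
      headers: { 'Content-Type': 'application/json' },
      // The request body is sent as a base64-encoded string.
      body: Buffer.from(JSON.stringify(payload)).toString('base64'),
    },
  };
}

// In the endpoint that Cloud Scheduler triggers, the task could be submitted
// like this (assumes @google-cloud/tasks is installed):
//   const { CloudTasksClient } = require('@google-cloud/tasks');
//   const client = new CloudTasksClient();
//   const parent = client.queuePath(projectId, location, queueName);
//   await client.createTask({ parent, task: buildAppEngineTask('/stream', {}) });
```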
As Guillaume mentioned, GAE isn't really intended for implementing services like the one you want.
However, it's possible to do something similar simply by configuring a minimum of 1 idle instance:
GAE will start an idle instance for the service automatically, without waiting for a triggering request
when the idle instance dies accidentally or is terminated because it reaches the end of its allowed lifespan GAE will again start a new idle instance
when the 1st request comes in GAE will dispatch it to the idle instance, that instance thus becoming active (serving subsequent requests) and GAE will immediately start a new idle instance to have it on standby
when the only active instance dies GAE won't start a new instance immediately, it'll wait until a new request comes in, which will be like the 1st request
when traffic is high enough GAE will start dispatching it to the idle instance on standby activating it and again start a new idle instance on standby.
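For reference, in the App Engine standard environment the minimum number of idle instances is set under automatic_scaling in app.yaml. This is only a fragment, and the rest of the file (runtime, handlers, and so on) is assumed to already exist:

```yaml
# Fragment of app.yaml: keep at least one idle instance on standby.
automatic_scaling:
  min_idle_instances: 1
```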
I have an issue with our Python 2.7 Google App Engine project where after a lot of research it appears as though App Engine is repeatedly sending through the same request to our servers. Is anyone else experiencing this issue?
EDIT:
It's a standard request with all the same parameters as it was first sent with (some of which are 15 days old now). I'm wondering if it's a similar situation to this: https://groups.google.com/forum/#!topic/google-appengine-downtime-notify/9fAYP7UyppQ
Typically App Engine doesn't send requests, it's the framework you use to handle requests.
If you're getting duplicate requests coming in, you should look to see where it's coming from. If it's coming from an app engine server to your own external server, then somebody is running some sort of process on their app engine account that's accessing data from your server during its execution.
The details, including what you would do about it, are all situation-dependent. There's no catch-all response. Just be aware that, yes, any computer on the Internet can make web requests to any other. And that includes app engine (in both directions).
My Google App Engine application returns 503 codes to clients when they should try later to receive data from the server. This appears to cause app engine to think the application is failing and after several such responses the instance is restarted, adding to average latency. Is there a way to prevent app engine from restarting instances just because you manually return 503 or other non-200 http response codes? TIA!
Edit 1:
Here's a screenshot of how it typically goes (with some app-specific stuff ablated due to the sensitive nature of my app). Note that all the [I]nformational and [D]ebug messages are generated by my code, while the [W]arning about the restart is obviously GAE itself. The only thing that distinguishes the times this happens is when I return a 503.
App Engine doesn't base decisions to terminate or restart instances based on the status codes you return, or on what you log.
It seems App Engine now does base instancing decisions on your return code. To the best of my knowledge, there's no way to return a 5xx code, but tell App Engine that nothing's really wrong.