I have a task queue set up for Google App Engine in Java. It has worked for a long time, but I just noticed a URI error in the admin dashboard:
A problem was encountered with the process that handled this request,
causing it to exit. This is likely to cause a new process to be used
for the next request to your application. (Error code 202)
What caused this error?
I'm seeing this as well. Little progress so far, but I did find the following, which may help:
Forum thread: https://groups.google.com/forum/?fromgroups=#!topic/google-appengine/jufkxPik1Js
Issues: https://code.google.com/p/googleappengine/issues/detail?id=8560
I bet that the instance is running out of memory. Are you using appstats? It can consume a large amount of memory.
I am working on a project in Apache Flink where I need to call multiple APIs to achieve my goal. The result of each API call is required for the next one to work. Also, since I am doing this on a KeyedStream, the same flow will apply to many records at once.
The diagram below illustrates the scenario:
                 /------API1---API2----
KeyedStream ----|------API1---API2----
                 \------API1---API2----
While doing all this, I get an exception saying "Buffer pool destroyed" after the job has run for some time. Is it related to the API calls? Do I need to use an asynchronous function? Please suggest. Thanks in advance.
A few things that are typically needed to help answer questions about Flink:
What version are you running?
How are you running it (from IDE, YARN cluster, stand-alone, etc)?
What's the complete stack trace for the exception?
(often) Can you share your code?
But at a high level, the "Buffer pool destroyed" message you mentioned is not the root cause of the failover; it's just a byproduct of Flink trying to kill off the workflow after an error has happened. So you need to dig deeper in the logs for the first exception that was raised (typically the Task Manager logs are where you'd look first).
I recently switched to a custom Go runtime on GAE and noticed many errors like this in the logs:
internal.flushLog: Flush RPC: Call error 3: invalid security ticket: 6c8027dc99b3ed3e
internal.flushLog: Flush RPC: Canceled: (timeout)
The server is still running well, but I have no idea what the error means or why it happens.
I'm using a custom Go runtime built from a Dockerfile, and the App Engine release is 1.9.37.
Any help to clarify the error would be highly appreciated. Thanks.
This is a known issue with the Go runtime on App Engine Flexible. It tends to happen when a line is logged right before the end of a request/response.
What happens is that when the line is logged it is actually put in a list of log lines to be batched together and sent to the application server as an RPC at periodic intervals. The security ticket is canceled at the end of a request/response which sometimes can happen before the log lines have been flushed. It's harmless, except that you may lose a log line or two. :\
We're actively working on fixing it.
Some requests silently fail in my python app, intermittently and unpredictably. The hallmarks of the failure are:
Request returns a 200, so the client doesn't know there's a problem.
Request does NOT successfully execute on the server.
No logging statements are recorded for the request.
Below is an example from my logs of a bunch of requests which are each supposed to write an entity to the datastore. You can see for the lower, successful request, a blue 'i' is present, indicating that info level logs were recorded. When I examine the datastore, an entity was successfully written for this request.
However, for the failed request, you can see there is just a white box, and there are no logging statements present at all. While the server returned a 200, no entity was written to the datastore for this request.
Has anyone encountered something like this before on App Engine? Any ideas on how to debug it? I've seen it in multiple different apps myself, but I've never been able to figure it out.
EDIT
To clarify, the main problem here is that code doesn't execute, as measured by the failure to write an entity. The spurious 200 and lack of logging is an associated symptom.
From a comment originally, but seems to be the resolution path for this issue:
Given that there are no log statements at all in the line and you appear to unpack the arguments and log them as soon as you enter the handler, this starts to look like an infrastructure/platform issue.
In such a case, it's best to open a public issue tracker issue, with "Type-Production" as a tag, including your app's app id and a timeframe, and as much information about your app and request handler involved as possible, and platform support will pick up the issue in the course of triage.
That said, it's worth examining the handler to make absolutely sure there's no way you could be exiting from the handler and sending a 200 without logging anything or seeing an exception. It all depends on what the code handling the request is capable of, what stack of libraries it's built upon, etc.
I had converted some tasks to run on a dynamic backend.
The tasks are failing silently [no logged error, no retry, nothing] ~20% of the time (min:10%, max:60%, sample:large, long term). Switching the task away from the backend restores retries and gets the failure rate back to ~0%.
Any ideas?
Converting it to a backend exacerbated the problem but wasn't the cause.
I had specified a task_retry_limit and the queue was a push queue. With a backend the number of instances is specified. (I believe you can replicate this issue on the frontend by ramping up requests rapidly, to a big number).
Tasks were failing with 503: Instance Unavailable until they hit the task_retry_limit. This is visible temporarily in Task Queues, but will not show up in Logs.
I should be using pull queues. Even if my use case was stupid, I'd probably +1 having a task that dies after multiple 503: Instance Unavailable responses log something, so it doesn't look like a phantom task.
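Here is a rough model of that failure mode. It is a simulation, not App Engine code: `dispatch`, `backend`, and `freeAt` are hypothetical names, and the only platform behaviour assumed is that a task with task_retry_limit N gets at most N+1 attempts. A 503 is returned before application code runs, so the application log stays empty when every attempt is rejected.

```go
package main

import "fmt"

// backend is a hypothetical model of a backend with limited instances.
type backend struct {
	freeAt int      // first attempt number at which an instance is free
	log    []string // lines written by application code
}

// handle returns the HTTP status for one delivery attempt.
func (b *backend) handle(attempt int) int {
	if attempt < b.freeAt {
		return 503 // rejected before application code runs, so nothing is logged
	}
	b.log = append(b.log, fmt.Sprintf("task ran on attempt %d", attempt))
	return 200
}

// dispatch models a push queue delivering one task. retryLimit plays the
// role of task_retry_limit: the task gets at most retryLimit+1 attempts.
func dispatch(handler func(attempt int) int, retryLimit int) (attempts int, ok bool) {
	for attempts = 1; attempts <= retryLimit+1; attempts++ {
		status := handler(attempts)
		if status >= 200 && status < 300 {
			return attempts, true
		}
	}
	return retryLimit + 1, false // retries exhausted; the task is dropped
}

func main() {
	b := &backend{freeAt: 10} // no instance frees up within the retry budget
	n, ok := dispatch(b.handle, 3)
	fmt.Printf("attempts=%d ok=%v app log lines=%d\n", n, ok, len(b.log))
	// prints "attempts=4 ok=false app log lines=0"
}
```

The task disappears after four rejected attempts without a single application log line, which is why it looks like a phantom.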
Which runtime are you using on the backend?
Try running the backend for a bit without dynamic set to true and exercise the failing component.
On my project, I have seen tasks that target a static backend disappear on occasion, but nowhere near the rate you are seeing.
I have a request in AppEngine that takes a little while to complete (many seconds). Is there a way to detect whether the user or some network problem has already aborted the request? This would allow me to save myself the server-load of continuing the result generation, which won't go anywhere anyways.
I tried the following in Dev-Mode, but neither worked (haven't checked yet whether it behaves differently in production mode):
Checking whether resp.getOutputStream completes without throwing an IOException
Checking whether there was an Interrupt sent to the servlet thread
Thanks, Markus
PS: I am really specifically interested in this question, not in ways to restructure my app to make the request faster or prevent aborts or other things.
I don't know if that is possible at all on App Engine, since App Engine doesn't support in-progress responses. The response is sent to the client only after the handler/servlet has returned.
No, there is no way to detect this from inside the app. I wouldn't worry about it.
Way late, but this may be useful. In Go you can detect cancellation using the context package.
Here is a useful video of Francesc Campoy explaining it:
https://www.youtube.com/watch?v=LSzR0VEraWw