I am observing strange behavior of my tomcat server, it seems like tomcat is not writing response to the client fast enough. Here is what I am seeing:
When firing aound 200 requests at the same time at my tomcat server, my application logs shows that my servlet's doGet() finishes process the request in about 500ms. However, at the client side the average response time is about 30 seconds (which means client start seeing response from tomcat after 30 seconds)!
Does anyone have any idea about how come there are such long delay between the end of my servlet's process time and the time when client receives response?
My server is hosted on Rackspace VM.
Found the culprit. I observed that the hosting server was using abnormally high CPU usage for even for only few requests, so I attached JConsole to Tomcat and found that all my worker thread has high blocking count... and are constantly in blocking state. Looking at the stack trace the locking happened during JAXBContext instantiation. Digg further, the application creating JAXBContext, which is relatively expensive, for each request.
So in summary, the problem was caused by JAXBContext instantiation per thread. Solution was to ensure JAXBContext is created once per application.
Related
Google describes basic scaling like this:
I don't really have any other choice, as I'm using a B1 instance, so automatic scaling is not allowed.
That raises the question though, if I have an endpoint that takes a variable amount of time (could be minutes, could be hours), and I basically have to set an idle_timeout, will App Engine calculate idle_timeout from the time the request was made in the first place or when the app is done handling the request?
If the former is right, then it feels a bit unfair to have to guess how long requests will take when thread activity is a usable indicator of whether or not to initiate shutdown of an app.
Here you are mixing up two different terms.
idle_timeout is the time that the instance will wait before shutting down after receiving its last request
Request timeout is the amount of time that App Engine will wait for the return of a request from your app
As per the documentation:
Requests can run for up to 24 hours. A manually-scaled instance can
choose to handle /_ah/start and execute a program or script for many
hours without returning an HTTP response code. Task queue tasks can
run up to 24 hours.
I use GAE standard environment and cron service. I deploy my app and cron jobs successfully based on app.yaml and cron.yaml.
And, when I run cron jobs firstly. I got below logs from stackdriver:
This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application.
A problem was encountered with the process that handled this request, causing it to exit. This is likely to cause a new process to be used for the next request to your application.
How do I solve this? thanks.
I am unable to update my frontends nor my backends. I get the error message 'Version is not ready'. This bug has persisted for coming up to 24 hours now. I have a task perpetually running in a queue. My best guess is that this task is stopping the update. I am unable to delete the task as it is perpetually running, nor can I delete the queue as I am unable to upload a new queue.yaml definition. The same task previously failed due to a maximum recursion error as I had a synchronous RPC within an asynchronous tasklet.
I'm pretty sure the fix will require someone from the GAE side forcibly resetting the task queue. Thus, this question would be more suitably directed to the GAE team with details about my app in a less public forum. Though, from what I can see, they do not allow direct support questions and suggest posting the question here. My follow up question, then, is when you have a GAE issue that requires action from the GAE team - how do you get hold of them (other than paying US$500/month for a premium support account)?
EDIT:
The task is/was meant to be running on a backend instance. I intended to shutdown all backend and frontend instances via the console assuming that they would cancel the task and restart themselves. But I found that only one frontend instance was running - no backends. After shutting down that frontend instance, the dashboard has reported that I have 0 instances running, yet the website is still serving and the task remains perpetually running.
EDIT:
Disabling the app stopped the task from running. After reenabling the app, I was able to update it. Though I am left with a ghost task in my queue.
If you have a stuck task queue job, I'd try disabling the queue and killing the instance running that job. If that doesn't work, I'd try disabling the app temporarily.
Configuration:
We have iPlanet web server which sits before WebSphere portal 6.1 cluster (2) deployed in Linux machines.
When user tries to copy a 10 GB file across file systems (NFS mounted), we are using java run time to copy the file across to a different NFS mount, hoping that it would be faster than using any other java libraries.
proc = rt.exec("cp " + fileName + " " + outFileName);
Application deployed is a JSF portlet application.
a) session timeout is 60 mins on the app server and the application
b) we have an Ajax call from the client page to keep the session alive
User receives HTTP 500 within 3 minutes, while our logs show that file is still copying. Not sure why WebSphere is sending HTTP 500?
After 10 minutes are so file is copied, and when he clicks on refresh he can proceed.
Not sure what is causing this HTTP 500.
WebContainer threads are not supposed to be used for long tasks.
He's getting 500 after 3 minutes because that is the time WebSphere decides the thread is hung.
What you should be doing is using a WorkManager to perform that long task and the client can poll to check the status of the task.
If you consider upgrading to WAS v8/v8.5 in the near future a good idea will be to use Asynchronous Servlets for that
The reason that your client receives an HTTP 500 error after a few minutes can happen for a few reasons. Without a stack trace and some relevant logging, it is impossible to know which component within WebSphere "woke up" after 3 minutes and stopped everything. It might be WebSphere's timeout setting for the Web Container thread pool, or it can be some other timeout - should be easily concluded from the logs.
To fix this, you can do one of the following:
Adjust the relevant timeout value (depending, again, on which timeout it is exactly).
Change your design so long-running tasks are executed in the background. You can use WebSphere's Work Manager API for that, or asynchronous beans / servlets.
Can we start a dynamic backend programatically? mean while when a backend is starting how can i handle the request by falling back on the application(i mean app.appspot.com).
When i stop a backend manually in admin console, and send a request to it, its not starting "dynamically"
Dynamic backends come into existence when they receive a request, and
are turned down when idle; they are ideal for work that is
intermittent or driven by user activity.
Resident backends run continuously, allowing you to rely on the state
of their memory over time and perform complex initialization.
http://code.google.com/appengine/docs/python/backends/overview.html
I recently started executing a long running task on a dynamic backend and noticed a dramatic increase in the performance of the frontends. I assume this was because the long running task was competing for resources with normal user requests.
Backends are documented quite thoroughly here. Backends have to be started and stopped with appcfg or the admin console, as documented here. A stopped backend will not handle requests - if you want this, you should probably be using the Task Queue instead.
It appears that a dynamic backend need not be explicitly stopped. The overvicew (http://code.google.com/appengine/docs/python/backends/overview.html) states that the billing for a dynamic backend stops 15 minutes after the last request is processed. So, if your app has a cron job, for example, that requires 5 minutes to complete, and needs to run every hour, then you could configure a backend to do this. The cost you'll incur is 15+5 minutes every hour, or 8 hours for the whole day. I suppose the free quota allows you 9 backend hours. So, this type of scenario would be free for you. The backend will start when you send your first request to it through a queue, and will stop 15 minutes after the last request you send is processed completely.