I have a strange problem with GAE Java. There are two instances with basic scaling for the version I am using, with one being used and the other idling, as far as I can see in the log. Response times are fine, and the idle instance has not received any requests for the last hour. Strangely, memory usage on the idle instance has been climbing constantly at around 2 MB/minute for that entire hour. The instance uses Google's JDBC connection to a MySQL Cloud SQL instance through a DBCP 1.4 connection pool with 2 connections, but I don't think there is any active processing going on, since a background thread shouldn't even be possible on App Engine.
It is at about 730 MB on a B2 instance (256 MB limit?) and will probably get restarted soon because of memory usage.
I am also using tracing on the connection (com.google.cloud.trace.instrumentation.jdbc 0.1.1), but again, I don't think this does anything as long as there are no queries.
How could this happen, and how can I find the memory leak? I thought threads would normally be stopped after 30 s, and I would not expect Google's JDBC driver to fill up memory all by itself.
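For reference, the pool is configured along these lines (a simplified sketch; the project, instance, database and credential names are placeholders, not the real configuration):

    // Simplified sketch of the DBCP 1.4 pool used against Cloud SQL.
    // Project, instance, database and credentials below are placeholders.
    import org.apache.commons.dbcp.BasicDataSource;

    public class Pool {
        public static BasicDataSource create() {
            BasicDataSource ds = new BasicDataSource();
            // Legacy App Engine driver for Cloud SQL (MySQL)
            ds.setDriverClassName("com.mysql.jdbc.GoogleDriver");
            ds.setUrl("jdbc:google:mysql://my-project:my-instance/mydb");
            ds.setUsername("appuser");
            ds.setPassword("secret");
            ds.setMaxActive(2); // the pool is capped at 2 connections
            ds.setMaxIdle(2);
            return ds;
        }
    }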
To answer my own question: it seems this is not related to JDBC at all, but rather to the Endpoints service control API:
Cloud endpoint management leaking memory?
Longtime listener, first-time caller here... my Postgres database says it allows the following:
Connections:
6/120
What should my corresponding "pool" setting be in this scenario? 6? 120? Something else entirely? Thanks in advance for any help here.
If it makes a difference I'm using Puma & Sidekiq to run a Rails 4 application on Heroku.
How many connections does your app use under typical load? Set the idle pool to that, and set the max pool to somewhere under the max allowed by the server.
That said, the server-side connection limit should also be tuned to your application and hardware. It's typically some function of your core count, RAM, work_mem setting and the kind of disks you have (the PostgreSQL wiki's rough starting point for active connections is about (2 × core count) + effective spindle count), but it will also depend on what kind of queries your app typically runs.
(see here for some tips: https://wiki.postgresql.org/wiki/Number_Of_Database_Connections)
Postgres is actually pretty forgiving: compared to many other databases, opening connections is cheap, so an undersized pool doesn't hurt much; idle open connections (an oversized pool) are also cheap (a few KB of shared buffers each, if memory serves).
It's really having more active connections than your resources allow that will cause problems, which is why the server-side configuration is more important.
I am seeing a drastic difference in latency between development and production when connecting to a Cloud SQL backend, much more than I would expect.
I ran a test where:
I fetched 125, 250, 500, 1000 and 2000 rows (row size approximately 30 bytes)
I fetched each row count 20 times, to get a good sample of the timings
The test was run in three environments:
Hosted App Engine
Development mode locally, but connecting to Cloud SQL via static IP
Development mode locally, connecting to a local VM running MySQL
Here you can see the results:
Now, I would expect some speed fluctuations on the order of 50-200 ms, but 3-4 seconds seems a bit high.
I'm new to App Engine, so are there any newbie mistakes that might be causing this? Or other suggestions? I ran a profiler on my code on App Engine and there is a call to _apiProxy.Event "wait" that eats up at least 500 ms (but never more than 750 ms); other than that, there were no long-running calls. There are a number of shorter calls that eventually add up, of course, but it's not like I have a loop that needs to be tuned or anything.
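For context, the timing loop is essentially the following (a simplified sketch; the real table, columns and row-count handling differ):

    // Simplified sketch of the fetch test; table name and query are placeholders.
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class FetchBenchmark {
        static void time(Connection conn, int rows) throws Exception {
            long totalMs = 0;
            for (int i = 0; i < 20; i++) { // 20 samples per row count
                long start = System.nanoTime();
                try (PreparedStatement ps = conn.prepareStatement(
                        "SELECT * FROM test_rows LIMIT ?")) { // ~30-byte rows
                    ps.setInt(1, rows);
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) { /* drain the result set */ }
                    }
                }
                totalMs += (System.nanoTime() - start) / 1000000L;
            }
            System.out.println(rows + " rows: avg " + (totalMs / 20) + " ms");
        }
    }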
Thanks in advance!
First off, check the connectivity path you are using: are you connecting via the latest documented method? Cloud SQL used to have a connectivity path that is slower and is now deprecated but still functional, so you could be going through that.
Second, are the App Engine app and the Cloud SQL instance in the same location? Check that the "Preferred Location" in your Cloud SQL settings is set to follow the App Engine app you are connecting to.
As a last possibility, which seems unlikely given your local numbers: make sure you are reusing database connections, since creating new ones can be expensive. If for some reason your app reuses connections locally but opens new ones on the App Engine side, that could produce this behavior. But as I said, this one seems unlikely.
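To illustrate that last point, a minimal sketch of the difference (class and method names here are only illustrative, not taken from the app in question):

    // Illustrative only: hand out connections from one shared pool created at
    // startup instead of opening a brand-new connection on every request.
    import java.sql.Connection;
    import java.sql.DriverManager;
    import javax.sql.DataSource;

    public class ConnectionReuse {
        private final DataSource pool; // created once, e.g. at app startup

        public ConnectionReuse(DataSource pool) {
            this.pool = pool;
        }

        // Slow path: a full connection handshake on every request.
        Connection openPerRequest(String url, String user, String pass) throws Exception {
            return DriverManager.getConnection(url, user, pass);
        }

        // Fast path: borrow an already-open connection; close() returns it to the pool.
        Connection borrowPooled() throws Exception {
            return pool.getConnection();
        }
    }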
I've encountered this problem for a second time now, and I'm wondering if there is any solution to this. I'm running an application on Google App Engine that relies on frequent communication with a website through HTTP JSON RPC. It appears that GAE has a tendency to randomly display a message like this in the logs:
"This request caused a new process to be started for your application,
and thus caused your application code to be loaded for the first time.
This request may thus take longer and use more CPU than a typical
request for your application."
It also resets all variables stored in RAM without warning. The same thing happens over and over, no matter how many times I set the variables again or upload newer code to GAE, although incrementing the app version number seems to solve the problem.
How can I get more information on this behaviour, and how can I avoid it and prevent data loss in my Go applications on Google App Engine?
EDIT:
The variables stored in RAM are small classes of strings, bytes, bools and pointers. Nothing too complicated or big.
Google App Engine seems to "start a new process" within a matter of seconds of heavier use, which shouldn't be nearly long enough for the application to be shut down for being unused. The timespan between the application being uploaded to GAE, having its variables set, and a new process being created is less than a minute.
Do you realize that GAE is a cloud hosting solution that automatically manages instances based on load? This is its main feature and the reason people use it.
When load increases, GAE creates a new instance, which, of course, starts with all in-memory variables empty.
The solution is not to expect variables to be available: store them in permanent storage (session, memcache, datastore) at the end of a request and load them, if not already present, at the beginning of the next one.
You can read about GAE instances in their documentation here; check out the performance section:
http://code.google.com/appengine/kb/java.html
In your case, with only a small amount of data, if it's static you can load it into memory on startup of a new instance. If it's dynamic data, you should be saving it to the database using their APIs, as in the sketch below.
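For example, a minimal Java sketch (matching the KB link above; Go has an equivalent appengine/memcache package, and the key name and value type here are only illustrative):

    // Persist a value at the end of a request and reload it when a new
    // instance starts with empty RAM. Note that memcache entries can be
    // evicted, so data that must survive belongs in the datastore instead.
    import com.google.appengine.api.memcache.MemcacheService;
    import com.google.appengine.api.memcache.MemcacheServiceFactory;

    public class CounterStore {
        private static final MemcacheService CACHE =
                MemcacheServiceFactory.getMemcacheService();

        public static void save(long counter) {
            CACHE.put("counter", counter); // end of request
        }

        public static long load() {
            Object cached = CACHE.get("counter"); // beginning of request
            return cached != null ? (Long) cached : 0L; // fall back to a default
        }
    }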
My recommendation for keeping a GAE instance alive: either pay for the Always On service or follow my suggestions for using a cron job here:
http://rwyland.blogspot.com/2012/02/keeping-google-app-engine-gae-instances.html
I use what I call a "prime schedule" of 3, 7 and 11 minute cron jobs.
You should consider using Backends if you want long-running instances with resident memory.
Lately, I have seen GAE taking much, much longer to process requests than it did just a week ago. Nothing changed in my code, but GAE is now taking 4000-12000 ms to respond to requests. What makes it worse is that I have plenty of instances available with 0 requests on them.
Has anyone else seen this happen?
What can I do to fix it? I have gone as far as spinning up 15 extra instances (and paid through the nose for them), but nothing seems to send requests to the other idle instances reliably.
My bill has gone from 70-90c/day to $5-8/day without any code change or increase in traffic. In fact, I am losing traffic because of the huge latency.
QPS* Latency* Requests Errors Age Memory Availability
0.000 0.0 ms 1378 0 10:10:09 57.9 MBytes Dynamic
0.000 0.0 ms 1681 0 15:39:57 57.2 MBytes Dynamic
0.017 9687.0 ms 886 0 10:19:10 56.7 MBytes Dynamic
I recommend installing AppStats to get a picture of what's taking so long in each request. I'd guess that you're having some contention issues or large numbers of reads/writes caused by some new data configuration.
The idle instances won't help decrease latency: it looks like every request takes a long time, and with less than one request per minute (in this sample, anyway), 10-second requests could still run serially on the same instance.
We have a similar problem in our app. In our case, we are under the impression that GAE's scheduler does a poor job of balancing requests across existing instances.
In some cases, the scheduler decides to spin up new instances instead of reusing existing ones. Since spinning up a new instance takes 5 to more than 45 seconds, I suspect this might be what happened to you.
Try to investigate the following and see if it helps you:
Make sure your app has thread-safety enabled so that it can process concurrent requests. You configure this in app.yaml if you are using Python, or in appengine-web.xml if you use Java (for the Java case, see the snippet below). Of course, you also need to make sure that your application code actually is thread-safe.
In your application settings, if the minimum pending latency is still set to automatic, change it to an explicit value. I'd suggest around 10 seconds for now; you can experiment later to see which setting suits you best. This forces the scheduler to wait that long for an existing instance to become available before spinning up a new one.
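For the Java case mentioned above, the setting is the threadsafe flag in appengine-web.xml; a minimal example (the application id and version are placeholders):

    <!-- appengine-web.xml: enable concurrent requests on the Java runtime -->
    <appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
      <application>your-app-id</application>
      <version>1</version>
      <threadsafe>true</threadsafe>
    </appengine-web-app>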
Now, to answer your original question about sending all requests to the same instance: as far as I know, there is no way to address a specific frontend instance in order to direct requests to it.
What you could do is migrate your app to backend instances instead of regular frontend instances. Backends provide a way to directly target a particular instance. You could deploy your app on a single backend to have more control over the number of instances you spawn, and since backends bypass the scheduler, you would not see latencies caused by new instances spinning up.
The major drawback of this approach is that you lose the auto-scaling benefit of frontend instances. But judging from your low daily bill, I think scalability is not yet a major concern at your app's current scale.
Question/Environment
The goal of my web application is to be a handy interface to the database at our company.
I'm using:
Scalatra (as a minimal web framework)
Jetty (as servlet container)
SBT (Simple Build Tool)
JDBC (to interface with the database)
One of the requirements is that each user can manage a number of concurrent queries, and that even when he/she logs off, the queries keep running and can be retrieved later on (or their completion status checked if they stopped for any reason).
I suppose the queries will likely have to run in their own separate threads.
I'm not even sure whether this issue is orthogonal to connection pooling or not (which I'm definitely going to use; BoneCP and c3p0 seem nice).
Summary
In short: I need very fine-grained control over the lifetime of database requests, and they cannot be bound to the servlet lifetime.
What ways are there to fulfill these requirements? I've searched quite a bit on Google and Stack Overflow and haven't found anything that addresses my problem. Is this even possible?
What is missing from your stack is a scheduler, e.g. http://www.quartz-scheduler.org/
A rough explanation:
Your connection pool (e.g. c3p0) will be bound to the application's lifecycle.
Your servlets will send query requests to the scheduler (each request associated with the user who submitted it).
The scheduler will execute queries as soon as possible, using connections from the connection pool. It may also do so in a serialized order per user.
The user will be able to see all query requests associated with them, probably with a status (pending, completed with results, etc.). A minimal sketch of this arrangement follows below.
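A minimal sketch of that arrangement, assuming Quartz 2.x and a c3p0-backed DataSource created at application startup (the pool field, job data keys and result handling are placeholders):

    // Submit a user's SQL query as a Quartz job so it runs independently of
    // the servlet request that created it. Names here are illustrative only.
    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import javax.sql.DataSource;
    import org.quartz.*;
    import org.quartz.impl.StdSchedulerFactory;

    public class QueryJob implements Job {
        // In a real app this would be the c3p0 pool created at startup.
        static DataSource pool;

        @Override
        public void execute(JobExecutionContext ctx) throws JobExecutionException {
            String sql = ctx.getMergedJobDataMap().getString("sql");
            String user = ctx.getMergedJobDataMap().getString("user");
            try (Connection c = pool.getConnection();
                 Statement st = c.createStatement();
                 ResultSet rs = st.executeQuery(sql)) {
                // ... store results / completion status keyed by user ...
            } catch (Exception e) {
                throw new JobExecutionException(e);
            }
        }

        // Typically created once at application startup.
        public static Scheduler startScheduler() throws SchedulerException {
            Scheduler s = StdSchedulerFactory.getDefaultScheduler();
            s.start();
            return s;
        }

        // Called from a servlet: schedule the query and return immediately.
        public static void submit(Scheduler scheduler, String user, String sql)
                throws SchedulerException {
            JobDetail job = JobBuilder.newJob(QueryJob.class)
                    .usingJobData("user", user)
                    .usingJobData("sql", sql)
                    .build();
            scheduler.scheduleJob(job, TriggerBuilder.newTrigger().startNow().build());
        }
    }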