Google App Engine Cloud SQL too many connections - google-app-engine

I have an app that is running on Google App Engine and uses the Task Queue Api to do some of the heavier lifting in the background. Some of those tasks need to connect to Cloud SQL to do their work. At scale I get too many tasks attempting to connect to Cloud SQL at once. What I need is some sort of data service layer of a shared client so that the tasks themselves aren't making individual connections to Cloud SQL. If anyone has any ideas I'd love to hear them.

Yes, you can actually do this, but it will take a little bit of planning, coding, and configuration in your side.
One idea is to use a Pull Queue (instead of Push Queues). With a pull queue you can schedule your tasks and execute them in a separate module of your application. The hardware of that module can be configured separately from your main module, thus you can avoid too many instances serving requests which in turns will allow you to better use connection pooling.
Of course, depending on the traffic you are getting you might want to decide on how many concurrent backend instances you want running (and connecting to your DB) to avoid/minimize contention.
Easier said than done, but here are two resources that will help you out:
App Engine Modules - https://developers.google.com/appengine/docs/java/modules/
Pull Queues - https://developers.google.com/appengine/docs/python/taskqueue/overview-pull

Related

Cloud Tasks and the enforced rate

We are using Cloud Tasks to call an "on-prem" API gateway (using Http request). This API gateway (IBM API Connect) sits in front off an on-prem system (Oracle). This back end system can at times be very slow. >5s.
We are desperately trying to increase the throughput but “adjusting” the Cloud Task queue settings (like -max-dispatches-per-second etc).
gcloud tasks queues update queue-1 --max-dispatches-per-second=8 --max-concurrent-dispatches=16
But all we see when we “crank up” the Cloud Task settings is that yellow triangle telling us that we are “enforced" to lower rate due to "system resources".
My understanding is that the yellow triangle shows up due to “errors” from the API gateway we call. Basically GCP/Cloud Tasks re-acts "by it self" based on return codes/errors/time-outs/latency etc from the API end-point we are calling with the result of a very low rate/thru-put. Is this understanding correct? Can someone verify?
The GUI does say that "or because currently there is no instance available to execute a request". What instance are they talking about? So to me that means that there is a possibility that it's "GCP specific" resources that comes into the picture here and have an effect on the "enforced rate"? Or?
Anyway, any help/insight would be appreciated.
Thanks
The error message you are seeing can be prompted by any of the 2 things you are mentioning: "Enforced rates" or "lack of GCP resources at the time of request".
The "Enforced rates" that Cloud tasks is refering to are the ones mentioned here. As you mention, this is due to the server being overloaded and returning too many errors. When this happens Cloud tasks acts by itself and will slow down execution until errors stop.
The "currently there is no instance available to execute a request" message you are seeing is that GCP does not have resources to create the request. Remember that cloud tasks is a managed service so this means that requests are created by GCP fully managed compute engine instances. This is a bit rare, although it does happen from time to time.
In order to make sure which of these 2 issues is the one you are running into, I would recommend you to check your Stackdriver logs and see if you are getting a high amount of errors on the Cloud Tasks filter as if this is the case, most likely you are running into the "Enforced rates" territory.
Hope you find this useful!

Tomcat via Apache Server going down after too many connections

I have an Apache (2.4) Server that serves content through the AJP connector on a Tomcat 7 Server.
One of my clients manages to kill the tomcat instance after running too many concurrent connections to a JSP JSON Api service. (Apache still works, but tomcat falls over. Restarting Tomcat brings it back up) there are no errors in tomcats logs.
I would like to protect the site from falling over like that, but I am not sure what configurations to change.
I do not want to limit the number of concurrent connections as there are legitimate use cases for that,
My Tomcat memory settings are :
Initial Memory pool : 1280MB
Maximum memory pool : 2560MB
which I assumed was plenty.
It might be worth mentioning that the API service relies on multiple, possibly heavy MySQL connections.
Any advice would be most appreciated.
Why don't you'd slowly switch your most used/ important application features to microservices architecture and dockerize your tomcat servers to be able to manage multiple instances of your application. This will hopefully help your application to manage multiple connections without impacting the overall performance of the servers at the job.
If you are talking about scaling, you need to do the horizontal scaling her with multiple tomcat servers.
If you cannot limit user connections & still want the app to run smooth, then you need to scale. Architectural change to microservices is an option but may not be possible always for a production solution.
The best to think about is running multiple tomcats sharing the load. There are various ways to do this. With your tech stack, I feel the Apache 2 load balancer plugin in combination with Tomcat will do best.
Have an example here.
Now, with respect to server capacity, db connection capacity etc, you might also need to think about vertical scaling.

Google Cloud Bigtable Python Client Performance Issue

I'm running into a performance issue with Google Cloud Bigtable Python Client. I'm working on a flask API that writes to and reads from a GCP Bigtable instance. The API uses the python client to communicate with Bigtable, and was deployed to GCP App Engine flexible environment.
Under low traffic, the API works fine. However during a load test, the endpoints that reads and writes to Bigtable suffers a huge performance decrease compare to a similar endpoint that doesn't communicate with Bigtable. Also, a large percentage of requests went to the endpoint receives a 502 Bad Gateway, even when health check was turned off in App Engine.
I'm aware of that the client is currently in Alpha. I wonder if the performance issue is known, or if anyone also ran into the same issue
Update
I found a documentation from Google stating:
There are issues with the network connection. Network issues can
reduce throughput and cause reads and writes to take longer than
usual. In particular, you'll see issues if your clients are not
running in the same zone as your Cloud Bigtable cluster.
In my case, my client is in a different region, by moving it to the same region had a huge increase in performance. However the performance issue still exist, and the recommendation from the documentation is to put client in the same zone as Bigtable.
I also considered using Container engine or Compute Engine where it is easier to specify the zone, but I want stay with App Engine for its autoscale functionality and managed services.
Bigtable client take somewhere between 3 ms to 20 ms to complete each request, and because python is single threaded, during that period of time it will just wait until the response comes back. The best solution we found was for any writes, publish the request to Pubsub, then use Dataflow to write to Bigtable. It is significantly faster because publishing a message in Python would take way below 1 ms to complete, and because Dataflow can be set to exactly the same region as Bigtable, and it is easy to parallel, it can write much faster.
Though it doesn't solve the scenario where you need frequent read or write need to be instantaneous

Google App Engine for long running but low CPU tasks, or long-polling?

App Engine has been great for requests that process quickly with no external API calls to databases or caches or third-party resources, but we've found that introducing any sort of "longer running" component or external latency (for example in a HTTP POST operation that runs asynchronously in the background and might take a second or two to process a few more intense database queries... totally invisible and OK from a UX perspective on the client-side because it's asynchronous but expensive to App Engine billing since it's long running) ... the "instance hours" compound and drive costs up considerably.
These sorts of expense inducing situations where a request is literally just waiting for a response from an external resource and requiring almost zero CPU during their idling seem avoidable, but I'm not sure if it's avoidable with App Engine.
It's almost like a "long poll" where the response might be left open but doing nothing.
Is there a way to do this on App Engine without just paying an insane amount for instance hours, or would we be better off moving to Compute Engine or EC2? Does it scale automatically based on CPU load, or is it based solely on open and perhaps inactive requests in total count? — threadsafe is indeed enabled.
There are really two ways to go about this one (top of mind).
Use Task Queues!
If the work doesn't need to be exactly at the same time of the request, this is exactly what [task queues] in App Engine are for. They allow you to put a job on a queue, and have another module pick up the work. They're kind of great because you can separately scale your front end and back end processes.
If that doesn't work....
Use App Engine Flexible
Under the hood App Engine Flexible is just running GCE instances. The cost structure is entirely different, since you persistently have a VM running in the background serving your requests.
Hope this helps!
What you're really worried about here is how App Engine scales your instances. Because many of your requests require few resources, your app might be able to handle many more concurrent requests on a single instance than normal. You can look into parameters that shape scaling here. Of particular interest:
max_concurrent_requests The number of concurrent requests an automatic scaling instance can accept before the scheduler spawns a new instance (Default: 8, Maximum: 80).
There is a danger here, where an instance may fill up with non-long-polling requests and become overburdened. To prevent that, you could isolate your long-polling requests into their own service and set its scaling parameters separately from the rest of your app.

Moving app to GAE

Context:
We have a web application written in Java EE on Glassfish + MySQL with JDBC for storage + XMPP connection for GCM CCS upstream messaging + Quartz scheduler library. Those are the main components of the app.
We're wrapping up our app development stage and we're trying to figure out the best option for deploying it, considering future scalability, both in number of users and their geographical location (ex, multiple VMS on multiple continents, if needed). Currently we're using a single VM instance from DigitalOcean for both the application server and the MySQL server, which we can then scale up with some effort (not much, but probably more than GAE).
Questions:
1) From what I understood, Cloud SQL is replicated in multiple datacenters across the world, thus storage-wise, the geographical scalability is assured. So this is a bonus. However, we would need to update the DB interaction code in the app to make use of Cloud SQL structure, thus being locked-in to their platform. Has this been a problem for you ? (I haven't looked at their DB interaction code yet, so I might be off on this one)
2) Do GAE instances work pretty much like a normal VM would ? Are there any differences regarding this aspect ? I've understood the auto-scaling capability, but how are they geographically scalable ? Do you have to select a datacenter location for each individual instance ? Or how can you achieve multiple worldwide instance locations as Cloud SQL does right out of the box ?
3) How would the XMPP connection fare with multiple instances ? Considering each instance is a separate VM, that means each instance will have a unique XMPP connection to the GCM CCS server. That would cause multiple problems, ex if more than 10 instances are being opened, then the 10 simultaneous XMPP connections limit cap will be hit.
4) How would the Quartz scheduler fare with the instances ? Right now we're saving the scheduled jobs in the MySQL database and scheduling them at each server start. If there are multiple instances, that means the jobs will be scheduled on each instance; so if there are multiple instances, the jobs will be scheduled multiple times. We wouldn't want that.
So, if indeed problems 3 & 4 are like I described, what would be the solution to those 2 problems ? Having a single instance for the XMPP connection as well as a single instance for the Quartz scheduler ?
1) Although CloudSQL is a managed replicated RDBMS, you still need to chose a region when creating an instance. That said, you cannot expect the latency to be great seamlessly globally. You would still need to design a proper architecture to achieve that.
2) GAE isn't a VM in any sense. GAP is a PaaS and, therefore, a runtime environment. You should expect several differences: you are restricted to Java, PHP, GO and Python, the mechanisms GAE provide do you out-of-the-box and compatible third-parties libraries. You cannot install a middleware there, for example. On the other hand, you don't have to deal with anything from the infrastructure standpoint, having transparent high-availability and scalability.
3) XMPP is not my strong suit, but GAE offers a XMPP service, you should take a look at it. More info: https://developers.google.com/appengine/docs/java/xmpp/?hl=pt-br
4) GAE offers a cron service that works pretty well. You wouldn't need to use Quartz.
My last advice is: if you want to migrate an existing application, your best choice would probably be GCE + CloudSQL, not GAE.
Good luck!
Cheers!

Resources