Connection pool with Google App Engine and Google Cloud SQL - google-app-engine

There are numerous questions about using a db connection pool with Google App Engine, but a lot has changed recently. Up to this point, I could never get a connection pool to work with GAE. However, I think some recent developments may allow connection pooling to work, which may be why it is mentioned in the Google documentation (which seems to have recently been updated).
https://cloud.google.com/sql/docs/mysql/connect-app-engine
Can someone confirm that connection pools can be used?
1) We used Google Cloud SQL 1st gen, and the database could deactivate (go to sleep). This would make any existing connections stale.
With a 2nd gen database, there is no deactivation of databases. So this may address the problem.
2) Many connection pool implementations used threads.
With Java 8 being supported on GAE, it looks like threads are permitted.
3) Some people suggest that GAE's limited number of database connections (12) is a reason to use connection pools: the pool size could be set to GAE's limit, and thus an app would never exceed it.
a) First, documentation indicates a much larger number of connections, based on the size of the database.
https://cloud.google.com/sql/docs/quotas
b) Second, if there is a limit for a GAE app, is the limit per individual server instance or for an entire GAE app?
Any confirmation that the above thinking makes sense would be appreciated.

Regarding 1) Yes, with Cloud SQL instances of the 2nd generation, your instances don't deactivate unless it's for maintenance, etc.
2) I don't see why you can't use threads to connect to a 2nd generation Cloud SQL database. With Java 8, you can absolutely do that. To check how many threads you have open, you can run mysql> SHOW STATUS WHERE Variable_name = 'Threads_connected';
For 3a), I would go with the official documentation link that you provided already but remember that database connections consume resources on the server and the connecting application. Always use good connection management practices to minimize your application's footprint and reduce the likelihood of exceeding Cloud SQL connection limits. The limit of 12 connections was indeed in place in the past but it doesn't exist anymore.
3b) When a limit or quota refers to a Google App Engine app, then it's for the whole app unless it's specified that it's per instance. More specifically for Cloud SQL connections, you can find the limits here and there is actually a limit that is specific to instances. You can't have more than 100 concurrent connections for each App Engine instance running in a standard environment.
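To make that per-instance ceiling concrete, here is a minimal sketch (plain Python; `open_connection` is a hypothetical stand-in for a real driver call, not any Google API) of capping concurrent connections with a semaphore so a single App Engine instance never exceeds the limit:

```python
import threading

# Hypothetical stand-in for a real connection factory (e.g. a MySQL
# driver's connect call); it returns a token so the sketch is self-contained.
def open_connection():
    return object()

class CappedConnections:
    """Caps concurrent connections so one App Engine instance never
    exceeds the per-instance ceiling (100 in a standard environment)."""

    def __init__(self, limit=100):
        self._slots = threading.BoundedSemaphore(limit)

    def acquire(self, timeout=30):
        # Block until a slot frees up; fail rather than exceed the cap.
        if not self._slots.acquire(timeout=timeout):
            raise RuntimeError("connection cap reached")
        return open_connection()

    def release(self, conn):
        # A real app would also close or pool `conn` here.
        self._slots.release()
```

A full pool would reuse the connections it hands back; this only enforces the upper bound.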
I hope that helps!

How can you simulate "serverless" Cloud SQL?

Problem: Cloud SQL instances run indefinitely and are monetarily expensive to host.
Goal: Save money while not compromising on database availability.
It has been almost four years and Google Cloud has not fulfilled this feature request, which AWS has already implemented with Aurora Serverless on RDS.
Since it does not seem that on demand Cloud SQL that auto-scales to zero is coming any time soon, will the following strategy work?
Run two Cloud SQL instances, a Baby and a Papa. They follow the master/slave replica principle, with a twist. The Baby instance is small, with few vCPUs and little memory; it always runs, but cheaply. The Papa instance is expensive, with many vCPUs and lots of memory, but runs only when needed.
To begin, only the Baby Cloud SQL instance is running, so it is the master that accepts reads/writes. The Papa Cloud SQL instance is not running.
Since I am using standard App Engine, which auto-scales to zero with no traffic, schedule a cron job that checks every 10 minutes whether any App Engine instances exist. If none exist, the application has no traffic. If instances do exist, the Papa Cloud SQL instance is started. Once started, the Papa instance becomes the master that accepts reads/writes, while the Baby instance becomes a slave replica capable of only reads.
If the cron job detects that App Engine has zero instances running, there is no traffic, so the Papa Cloud SQL instance is stopped and the Baby Cloud SQL replica is promoted to master, able to accept reads/writes again.
In this way, the expensive Papa instance runs on demand. If there is a traffic spike while the Papa instance is stopped or rebooting, the Baby instance will still be able to respond to requests.
This strategy ensures that the expensive Papa Cloud SQL instance only runs when there is traffic. Is this Baby-Papa dynamic possible on Google Cloud?
Cloud SQL has an Admin API that can be used to manipulate your Cloud SQL instances in such a way. You could build pieces of what you are describing using Cloud Scheduler to trigger a Cloud Function which uses the API to start and stop instances, or even promote/demote them to master.
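As a sketch of what such a Cloud Function could send, the Admin API lets you stop or start an instance by patching its activation policy ("NEVER" stops it, "ALWAYS" starts it). This builds the request with the standard library only; the project and instance names are placeholders, and the actual call needs an OAuth token (e.g. via google-api-python-client), omitted so the sketch stays runnable without credentials:

```python
import json

SQLADMIN = "https://sqladmin.googleapis.com/v1"

def activation_request(project, instance, policy):
    """Build the Cloud SQL Admin API PATCH that starts ("ALWAYS")
    or stops ("NEVER") an instance via its activation policy."""
    assert policy in ("ALWAYS", "NEVER")
    url = "%s/projects/%s/instances/%s" % (SQLADMIN, project, instance)
    body = json.dumps({"settings": {"activationPolicy": policy}})
    return url, body

# A Cloud Scheduler-triggered Cloud Function would PATCH this URL with
# the body and an Authorization header; promoting a replica to master is
# a separate call (instances.promoteReplica).
```
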
However, it's probably a bad idea. These operations can take several minutes to complete and would dramatically increase cold start times for requests. Additionally, SQL servers prefer to be long running for a reason - they use resources to cache and optimize queries to improve performance. Starting, stopping, and resizing instances can cause you to lose these benefits.
It's better to consider - do you actually need a relational database? If not, it's probably better to use something like Firestore, which is a serverless product.
If you determine that you do indeed need a relational database, can you optimize your use to fit a smaller Cloud SQL instance? Can you cache queries using Memorystore or Firestore, or instead use the services described above to export results on a timed basis into a form that is easier for your app to consume?
Would it be better to start and stop your Cloud SQL instance when there is no traffic? If your traffic follows predictable times, you could schedule your instance to resize at the start and end of these periods.
Finally, if cost is really the concern, you could run your own SQL server on a GCE instance. This means you have to do pretty much all of the management yourself (installation, updates, maintenance, etc.), but it would be cheaper.
All of these are probably much more functional solutions than trying to shoehorn non-serverless infrastructure to match a serverless workload.

integrate kafka with Odoo

I have an Odoo front end on an AWS EC2 instance, connected to PostgreSQL hosted on ElephantSQL with a limit of 15 concurrent connections.
I want to make sure this connection limit will pose no problem, so I want to use Kafka to perform database writes instead of Odoo doing it directly, but I found no resources online to help me out.
Is your issue about Connection Pooling? PostgreSQL includes two implementations of DataSource for JDBC 2 and two for JDBC 3, as shown here.
- dataSourceName (String): Every pooling DataSource must have a unique name.
- initialConnections (int): The number of database connections to be created when the pool is initialized.
- maxConnections (int): The maximum number of open database connections to allow. When more connections are requested, the caller will hang until a connection is returned to the pool.
The pooling implementations do not actually close connections when the client calls the close method, but instead return the connections to a pool of available connections for other clients to use. This avoids any overhead of repeatedly opening and closing connections, and allows a large number of clients to share a small number of database connections.
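That close-returns-to-pool behaviour can be sketched in a few lines (plain Python with a caller-supplied `connect` factory; a toy illustration, not the actual JDBC implementation - real pools add locking, timeouts, and connection validation):

```python
import queue

class MiniPool:
    """Toy pool: close() returns the connection for reuse instead of
    tearing it down, and callers block once max_connections are all
    checked out - mirroring initialConnections/maxConnections above."""

    def __init__(self, connect, initial_connections, max_connections):
        self._idle = queue.Queue(maxsize=max_connections)
        self._connect = connect
        self._max = max_connections
        self._total = 0
        for _ in range(initial_connections):
            self._idle.put(connect())
            self._total += 1

    def get(self, timeout=None):
        # Grow up to the cap, then hand out idle connections,
        # blocking when everything is checked out.
        if self._idle.empty() and self._total < self._max:
            self._total += 1
            return self._connect()
        return self._idle.get(timeout=timeout)

    def close(self, conn):
        self._idle.put(conn)  # back to the pool, not actually closed
```
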
Additionally, you might want to investigate Pgbouncer. Pgbouncer is a stable, in-production connection pooler for PostgreSQL; the application doesn't even realise it isn't talking to PostgreSQL directly. Pgbouncer can do load handling, connection scheduling/routing and load balancing. Read more from this blog that shows how to integrate it with Odoo. There are a lot of references from that page.
Finally, I would second OneCricketeer's comment, above, on using Amazon RDS, unless you are getting a far better deal with ElephantSQL.
On using Kafka, you have to realise that Odoo is a frontend application that is synchronous to user actions, so you are not architecturally able to have a functional system if you put Kafka in between Odoo and the database. You would input data and see it in about 2-10 minutes. I exaggerate, but if that is what you really want to do, then by all means invest the time and effort.
Read more from Confluent, the company founded by the team that created Kafka at LinkedIn, on how they used a solution called Bottled Water to do some cool streams over PostgreSQL; that should be closer to what you want to do.
Do let us know which option you selected and what worked! Keep the community informed.

max no of allowed concurrent connections of google cloud sql from app engine

I am using 2nd gen Google Cloud SQL and running an App Engine instance in the flexible environment. The App Engine app connects to Cloud SQL to run SQL queries.
I have set up a connection pool to create min of 10 connections and max of 200.
However, I read this - https://cloud.google.com/sql/docs/mysql/diagnose-issues
which says - "Each App Engine instance running in a standard environment cannot have more than 12 concurrent connections to a Google Cloud SQL instance."
It says nothing, however, about the flexible environment of App Engine.
I tried updating the min connections in my connection pool to 20. On the dashboard I could see 20 active connections, so the 12-concurrent-connection limit seems to apply only to the standard environment. However, I could not find any document confirming this.
Can anyone educate me on the limits of concurrent connections from the flexible environment?
There are no limits specific to App Engine Flexible. You can create as many connections as the Cloud SQL instance will allow.
The maximum number of allowed connections is described here:
https://cloud.google.com/sql/faq#sizeqps
Keep in mind that this limit is not an indicator of how many connections your instance can handle for your workload. For example, if you have a heavy workload and you use a n1-standard-1 instance, it's unlikely that you can utilize all 4000 connections.

Is it possible to limit number of connections used by Entity Framework?

I've noticed that on a NopCommerce site we host (which uses Entity Framework) that if I run a crawler on the site (to check for broken links) it knocks the entire webserver offline for a few minutes and no other hosted sites respond. This seems to be because Entity Framework is opening 30-odd database connections and runs hundreds of queries per second (about 20-40 per page view).
I cannot change how EF is used by NopCommerce (it would take weeks) or change the version of EF being used, so can I mitigate the effects it has on SQL Server by limiting how many concurrent connections it uses, to give other sites hosted on the same server a fairer chance at database access?
What I'm ideally looking to do, is limit the number of concurrent DB connections to about 10, for a particular application.
I think the best you can do is use the Max Pool Size setting in the connection string. This limits the maximum number of connections in the connection pool, and I think this means it's the maximum number of connections the application will ever use. What I'm not sure of, though, is whether it will cause an exception when it can't get a connection from the pool. I've never tried limiting the connections in this manner.
Here's a little reading on the settings you can put in an ADO.NET connection string:
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlconnection.connectionstring%28v=vs.100%29.aspx
And here's a little more reading on "Using Connection Pooling":
http://msdn.microsoft.com/en-us/library/8xx3tyca%28v=vs.100%29.aspx
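For example (server and database names are illustrative, not from NopCommerce), the cap goes directly into the ADO.NET connection string:

```
Server=.\SQLEXPRESS;Database=ShopDb;Integrated Security=True;Max Pool Size=10;Connect Timeout=15;
```

With this in place, as far as I recall, a request for an 11th concurrent connection blocks for up to Connect Timeout seconds and then fails with an InvalidOperationException ("Timeout expired... max pool size was reached"), so it's worth testing that failure mode under load before relying on this as a throttle.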

Is it possible to host SQL server in the cloud and connect an ASP.NET app hosted on DiscountASP?

Does anyone know if it's possible to host a SQL server in the cloud and connect an ASP.NET app hosted on DiscountASP?
I'd like to consolidate my SQL Server instances but keep the web app hosting where it's at. There are various reasons for I want to do this and I don't particularly want to get into it. I don't have any experience with cloud computing but I'm trying to wrap my head around it. It seems to be similar to standard hosting except for the metered billing and flexibility. If my idea is nuts and flawed, feel free to let me know, but be nice. ;-)
Yes, but subject to the following:
Unless your cloud provider offers a VPN or you otherwise encrypt the data, all traffic will be unencrypted over the internet
It will probably be slow as every DB operation goes over the internet.
Some cloud providers (Amazon for sure) charge for internet traffic but not for internal traffic.
Depending on what features of SQL Server you use, SQL Azure hosting is close to production; it goes live for the US datacentres in January (PDC announcement) and billing for the service starts in February.
At present the database size limits are just 1 GB and 10 GB, which is a bit limiting, but if your data can be partitioned across databases and the app can be changed to understand this, then the limitation is not so harsh.
It has limitations on functionality, etc., but it is a potential choice that could be investigated - though not the only one.
