Unable to connect to Google Cloud SQL from development server using SQLAlchemy - google-app-engine

I've been unsuccessful at connecting to Google Cloud SQL using SQLAlchemy 0.7.9 from my development workstation (in hopes of generating the schema using create_all()). I can't get past the following error:
sqlalchemy.exc.DBAPIError: (AssertionError) No api proxy found for service "rdbms" None None
I was able to successfully connect to the database instance using google_sql.py instancename, which initially opened a browser to authorize the connection. The authorization now appears to be cached, although I don't have the ~/Library/Preferences/com.google.cloud.plist file that https://developers.google.com/cloud-sql/docs/commandline indicates I should.
Here is the simple application I'm using to test the connection:
from sqlalchemy import create_engine
engine = create_engine('mysql+gaerdbms:///myapp', connect_args={"instance":"test"})
connection = engine.connect()
The full stacktrace is available here - https://gist.github.com/4486641

It turns out the mysql+gaerdbms:/// driver in SQLAlchemy was only set up to use the rdbms_apiproxy DBAPI, which can only be used when accessing Google Cloud SQL from a Google App Engine instance. I submitted a ticket to SQLAlchemy to update the driver to use the OAuth-based rdbms_googleapi when not on Google App Engine, just like the Django driver provided in the App Engine SDK. rdbms_googleapi is also the DBAPI that google_sql.py (the remote SQL console) uses.
The updated dialect is expected to be part of the 0.7.10 and 0.8.0 releases, but until they are available you can do the following:
1 - Copy the updated dialect from the ticket into a file (e.g. gaerdbms_dialect.py)
"""Support for Google Cloud SQL on Google App Engine

Connecting
----------
Connect string format::

    mysql+gaerdbms:///<dbname>?instance=<project:instance>

Example::

    create_engine('mysql+gaerdbms:///mydb?instance=myproject:instance1')
"""
import re

from sqlalchemy.dialects.mysql.mysqldb import MySQLDialect_mysqldb
from sqlalchemy.pool import NullPool


class MySQLDialect_gaerdbms(MySQLDialect_mysqldb):

    @classmethod
    def dbapi(cls):
        # On App Engine the rdbms_apiproxy stub is registered; elsewhere,
        # fall back to the OAuth-based rdbms_googleapi.
        from google.appengine.api import apiproxy_stub_map
        if apiproxy_stub_map.apiproxy.GetStub('rdbms'):
            from google.storage.speckle.python.api import rdbms_apiproxy
            return rdbms_apiproxy
        else:
            from google.storage.speckle.python.api import rdbms_googleapi
            return rdbms_googleapi

    @classmethod
    def get_pool_class(cls, url):
        # Cloud SQL connections die at any moment
        return NullPool

    def create_connect_args(self, url):
        opts = url.translate_connect_args()
        opts['dsn'] = ''  # unused but required to pass to rdbms.connect()
        opts['instance'] = url.query['instance']
        return [], opts

    def _extract_error_code(self, exception):
        match = re.compile(r"^(\d+):").match(str(exception))
        if match:
            return int(match.group(1))


dialect = MySQLDialect_gaerdbms
2 - Register the custom dialect (it can override the existing mysql.gaerdbms entry)
from sqlalchemy.dialects import registry
registry.register("mysql.gaerdbms", "application.database.gaerdbms_dialect", "MySQLDialect_gaerdbms")
Note: 0.8 allows a dialect to be registered within the current process (as shown above). If you are running an older version of SQLAlchemy, I recommend upgrading to 0.8+, or you'll need to create a separate install for the dialect as outlined here.
3 - Update your create_engine('...') URL, as the project and instance are now provided as part of the query string of the URL
mysql+gaerdbms:///<dbname>?instance=<project:instance>
for example:
create_engine('mysql+gaerdbms:///mydb?instance=myproject:instance1')
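For reference, putting the steps together for the original goal (generating the schema with create_all()) might look like the following sketch; the module path application.database.gaerdbms_dialect is the one used in step 2, and the Widget model is purely illustrative:
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.dialects import registry
from sqlalchemy.ext.declarative import declarative_base

# Register the custom dialect from step 1 under the mysql.gaerdbms name.
registry.register("mysql.gaerdbms", "application.database.gaerdbms_dialect",
                  "MySQLDialect_gaerdbms")

Base = declarative_base()


class Widget(Base):
    # Hypothetical model, for illustration only.
    __tablename__ = 'widgets'
    id = Column(Integer, primary_key=True)
    name = Column(String(100))


engine = create_engine('mysql+gaerdbms:///mydb?instance=myproject:instance1')
Base.metadata.create_all(engine)  # emits CREATE TABLE statements against Cloud SQL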

I think I have a working sample script as follows. Can you try a similar thing and let me know how it goes?
However (and you might already be aware of it), if your goal is to create the schema on the Google Cloud SQL instance, maybe you can create the schema against a local MySQL server, dump it with mysqldump, and then import the schema into Google Cloud SQL. Using a local MySQL server is very convenient for development anyway.
from sqlalchemy import create_engine

from google.storage.speckle.python.api import rdbms as dbi_driver
from google.storage.speckle.python.api import rdbms_googleapi

INSTANCE = 'YOURPROJECT:YOURINSTANCE'
DATABASE = 'YOURDATABASE'


def get_connection():
    return rdbms_googleapi.connect('', instance=INSTANCE, database=DATABASE)


def main():
    engine = create_engine(
        'mysql:///YOURPROJECT:YOURINSTANCE/YOURDATABASE',
        module=dbi_driver,
        creator=get_connection,
    )
    print engine.execute('SELECT 1').scalar()


if __name__ == '__main__':
    main()

Related

Flask and connecting to a database confusion

I am currently creating a web application using Flask. My main issue at the moment is understanding the concept of connecting to a database, as there are many resources online which are confusing me in terms of establishing a solid connection to a database. The syntax of SQL is not a problem, as I have knowledge of that.
I am choosing SQLAlchemy with a dialect of SQLite instead of MySQL, PostgreSQL, etc.
My first question is: is choosing a dialect while using SQLAlchemy necessary? Can we not use SQLAlchemy as it is?
Second Question: I have seen many examples and tutorials online using "phpMyAdmin" or something similar to have a visual and interactive way to deal with their database (relations) in their localhost browser. Is it necessary to set this up before creating any type of database connection for any type of project?
Second Question (extension): To set up phpMyAdmin, there are tutorials such as "https://www.youtube.com/watch?v=hVHFPzjp064&t=238s" indicating to activate Apache, activate PHP, and download MySQL to use a workbench. As stated in the second question, are these steps mandatory? Many tutorials don't seem to show how to set this up.
Third Question: As my project is slowly growing, I am using the 'separation of concerns' concept. My file tree is the following:
After researching, I believe I should include database-related code in the __init__.py file, plus, of course, update the config file with the necessary configurations? What I don't understand is the syntax used to connect to a database. The following code shows my code in both of the files stated above:
__init__.py
# This class will ultimately bring our entire application together.
from flask import Flask
from config import Config
from flask_sqlalchemy import SQLAlchemy
from flask_migrate import Migrate
# Creating Flask app.
app = Flask(__name__)
# Creating a database object which represents the database.
# Created a migration object which represents the migration engine.
db = SQLAlchemy(app)
migrate = Migrate(app, db)
# TODO Explain reasons for using this method:
# Using method to determine Flask environment from the following link:
# https://www.youtube.com/watch?v=GW_2O9CrnSU&t=366s
if app.config["ENV"] == "production":
app.config.from_object("config.ProductionConfig")
elif app.config["ENV"] == "testing":
app.config.from_object("config.TestingConfig")
else:
app.config.from_object("config.DevelopmentConfig")
# Importing views file to avoid circular import.
from app import views
from app import admin_views
from app import routes, models
config.py
# This file contains important information regarding the configurations for this application.
# It is good practice to keep the configuration of the application in a separate file. This enforces the
# practice of 'separation of concerns'.
# There is a main class "Config" which has subclasses as illustrated below. The configuration settings
# are defined as class variables within the 'Config' class. As the application grows, we can create subclasses.
import os
basedir = os.path.abspath(os.path.dirname(__file__))
# The SECRET_KEY is important as it...
class Config(object):
    DEBUG = False
    TESTING = False
    SECRET_KEY = '\xb6"\xc5\xce\xc2D\xd1*\x0c\x06\x83 \xbc\xdbM\x97\xe2\xf4OZ\xdc\x16Jv'

    # The SQLAlchemy extension reads the location of the database from the URI variable.
    # The fallback value, used if the environment variable is not defined, is the SQLite URL below.
    SQLALCHEMY_DATABASE_URI = os.environ.get('DATABASE_URL') or \
        'sqlite:///' + os.path.join(basedir, 'app.db')

    # The 'modifications' config option is set to False as it prevents a signal from firing whenever
    # there is a change made within the database.
    SQLALCHEMY_TRACK_MODIFICATIONS = False


class ProductionConfig(Config):
    pass


class DevelopmentConfig(Config):
    DEBUG = True


class TestingConfig(Config):
    TESTING = True
I apologise if my questions seem all over the place. The more I research, the more confused I become about successfully connecting to a database.
I would appreciate it if someone could answer my concerns in an 'easy to understand' way.
You can avoid (or forestall) some amount of confusion by starting from flask-sqlalchemy, which provides a convenience layer over SQLAlchemy.
It will arrange for SQLALCHEMY_DATABASE_URI to be turned into an SQLAlchemy "engine" for the database specified in the URI. SQLAlchemy does all of the heavy lifting, and there's no need to call create_engine() yourself when using flask-sqlalchemy.
I'd add that following the organizational scheme from chapter 15 of the Flask Mega-Tutorial will carry you quite far.
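For your first question, a minimal sketch of that setup might look like the following (the User model and the app.db filename are illustrative); note there is no create_engine() call anywhere, because flask-sqlalchemy builds the engine from SQLALCHEMY_DATABASE_URI:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///app.db"
app.config["SQLALCHEMY_TRACK_MODIFICATIONS"] = False

# db wraps the engine/session machinery; models subclass db.Model.
db = SQLAlchemy(app)


class User(db.Model):
    # Hypothetical model, for illustration only.
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(80), nullable=False)


with app.app_context():
    db.create_all()  # creates app.db and the user table if they don't exist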

Total cpu utilization of running app engine flexible instances

I need to make decisions in an external system based on the current CPU utilization of my App Engine Flexible service. I can see the exact values / metrics I need to use in the dashboard charting in my Google Cloud Console, but I don't see a direct, easy way to get this information from something like a gcloud command.
I also need to know the count of running instances, but I think I can use gcloud app instances list -s default to get a list of my running instances in the default service, and then I can use a count of lines approach to get this info easily. I intend to make a python function which returns a tuple like (instance_count, cpu_utilization).
I'd appreciate if anyone can direct me to an easy way to get this. I am currently exploring the StackDriver Monitoring service to get this same information, but as of now it is looking super-complicated to me.
You can use the gcloud app instances list -s default command to get the list of running instances, as you said. To retrieve CPU utilization, have a look at this Python Client for Stackdriver Monitoring. To list the available metric types:
from google.cloud import monitoring

client = monitoring.Client()
for descriptor in client.list_metric_descriptors():
    print(descriptor.type)
Metric descriptors are described here. To display utilization across your GCE instances during the last five minutes:
metric = 'compute.googleapis.com/instance/cpu/utilization'
query = client.query(metric, minutes=5)
print(query.as_dataframe())
Do not forget to add google-cloud-monitoring==0.28.1 to "requirements.txt" before installing it.
Check this code, which runs locally for me:
import logging

from flask import Flask
from google.cloud import monitoring as mon

app = Flask(__name__)


@app.route('/')
def list_metric_descriptors():
    """Return all metric descriptors"""
    # Instantiate client
    client = mon.Client()
    for descriptor in client.list_metric_descriptors():
        print(descriptor.type)
    return descriptor.type


@app.route('/CPU')
def cpuUtilization():
    """Return CPU utilization"""
    client = mon.Client()
    metric = 'compute.googleapis.com/instance/cpu/utilization'
    query = client.query(metric, minutes=5)
    print(type(query.as_dataframe()))
    print(query.as_dataframe())
    data = str(query.as_dataframe())
    return data


@app.errorhandler(500)
def server_error(e):
    logging.exception('An error occurred during a request.')
    return """
    An internal error occurred: <pre>{}</pre>
    See logs for full stacktrace.
    """.format(e), 500


if __name__ == '__main__':
    # This is used when running locally. Gunicorn is used to run the
    # application on Google App Engine. See entrypoint in app.yaml.
    app.run(host='127.0.0.1', port=8080, debug=True)
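To get the (instance_count, cpu_utilization) tuple mentioned in the question, a rough sketch along these lines should work, assuming the same 0.28-era monitoring client as above and gcloud available on the PATH; averaging over the dataframe is just one way to collapse the per-instance series into a single number:
import subprocess

from google.cloud import monitoring


def instance_stats(service='default', minutes=5):
    """Return (instance_count, mean_cpu_utilization) for the given service."""
    # Count running instances by counting the non-header lines of gcloud output.
    output = subprocess.check_output(
        ['gcloud', 'app', 'instances', 'list', '-s', service])
    instance_count = max(len(output.strip().splitlines()) - 1, 0)

    # Average the CPU utilization samples from the last few minutes.
    client = monitoring.Client()
    query = client.query('compute.googleapis.com/instance/cpu/utilization',
                         minutes=minutes)
    df = query.as_dataframe()
    cpu_utilization = float(df.mean().mean()) if not df.empty else 0.0

    return instance_count, cpu_utilization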

Query an existing Google Cloud Datastore from Google app engine

I have an Entity with ~50k rows in Google Cloud Datastore (the standalone service, not GAE's). I am starting development with GAE and would like to query this existing datastore without having to import it into GAE. I have been unable to find a way to connect to an existing Datastore kind.
Below is basic code, altered from Hello World and other guides, that I'm trying to get working as a POC:
import webapp2
import json
import time

from google.appengine.ext import ndb


class Product(ndb.Model):
    type = ndb.StringProperty()

    @classmethod
    def query_product(cls):
        return ndb.gql("SELECT * FROM Product where name >= :a LIMIT 5 ")


class MainPage(webapp2.RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/plain'
        query = Product.query_product()
        self.response.write(query)


app = webapp2.WSGIApplication([
    ('/', MainPage),
], debug=True)
The returned error is:
TypeError: Model Product has no property named 'name'
It seems obvious that it's trying to use a GAE datastore with the kind Product instead of my existing Datastore where Product is already defined, but I can't find how to make that connection.
There is only one Google Cloud Datastore. App Engine does not have a datastore of its own - it works with the same Google Cloud Datastore.
All entities in the Datastore are stored for a particular project. If you are trying to access data from a different project, you will not be able to see it without going through special authentication.
I'm not too certain what it is you're trying to accomplish when you say that you would like to query this existing datastore without having to import it to GAE. I'm guessing that you have project A with the datastore with 50k rows, and you're starting project B. And you want to access the project A datastore from project B. If this is the case, and if you're trying to access the datastore from a different project, then maybe this previous answer that mentions remote api can help you.
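For completeness, a rough sketch of that remote API approach might look like this; the project-a-id app id and the Product model are placeholders, and the remote_api builtin has to be enabled in project A:
from google.appengine.ext import ndb
from google.appengine.ext.remote_api import remote_api_stub


class Product(ndb.Model):
    name = ndb.StringProperty()


# Point the datastore calls at project A's live Datastore over the remote API.
remote_api_stub.ConfigureRemoteApiForOAuth(
    'project-a-id.appspot.com', '/_ah/remote_api')

print Product.query().fetch(5)  # now reads project A's Product entities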
Below is working code. I was pretty close at the time I made this original post, but the reason I was getting no data back was that I was running my app locally. As soon as I actually deployed my code to App Engine, it pulled from Datastore with no problem.
import webapp2
import json
import time

from google.appengine.datastore.datastore_query import Cursor
from google.appengine.ext import ndb


class Product(ndb.Model):
    name = ndb.StringProperty()


class MainPage(webapp2.RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/plain'
        query = ndb.gql("SELECT * FROM Product where name >= 'a' LIMIT 5 ")
        output = query.fetch()
        # query = Product.query(Product.name == 'zubo - pre-owned - nintendo ds')
        # query = Product.query()
        # output = query.fetch(10)
        self.response.write(output)


app = webapp2.WSGIApplication([
    ('/', MainPage),
], debug=True)

devappserver2, remote_api, and --default_partition

To access a remote datastore locally using the original dev_appserver, I would set --default_partition=s as mentioned here.
In March 2013 Google made devappserver2 the default development server, and it does not support --default_partition, resulting in the original, dreaded:
BadRequestError: app s~appname cannot access app dev~appname's data
It appears that the first few requests are served correctly with
os.environ["APPLICATION_ID"] == 's~appname'
Then a subsequent request results in a call to /_ah/warmup and then
os.environ["APPLICATION_ID"] == 'dev~appname'
The docs specifically mention related topics, but appear geared toward dev_appserver, here:
Warning! Do not get the App ID from the environment variable. The development server simulates the production App Engine service. One way in which it does this is to prepend a string (dev~) to the APPLICATION_ID environment variable, which is similar to the string prepended in production for applications using the High Replication Datastore. You can modify this behavior with the --default_partition flag, choosing a value of "" to match the master-slave option in production. Google recommends always getting the application ID using the get_application_id() method, and never using the APPLICATION_ID environment variable.
You can do the following dirty little trick:
import os

from google.appengine.datastore.entity_pb import Reference

DEV = os.environ['SERVER_SOFTWARE'].startswith('Development')

def myApp(*args):
    return os.environ['APPLICATION_ID'].replace("dev~", "s~")

if DEV:
    Reference.app = myApp
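As the quoted warning suggests, code that only needs the application id (rather than the datastore partition) can sidestep the problem by asking the API instead of reading the environment variable; a minimal sketch:
# Prefer the API call over the raw environment variable, which the dev
# server prefixes with 'dev~'.
from google.appengine.api import app_identity

app_id = app_identity.get_application_id()  # id without any partition prefix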

AppEngine datastore - backup programmatically

I would like to backup my app's datastore programmatically, on a regular basis.
It seems possible to create a cron that backs up the datastore, according to https://developers.google.com/appengine/articles/scheduled_backups
However, I require a more fine-grained solution: Create different backup files for dynamically changing namespaces.
Is it possible to simply call the /_ah/datastore_admin/backup.create url with GET/POST?
Yes; I'm doing exactly that in order to implement some logic that couldn't be done with cron.
Use the taskqueue API to add the URL request, like this:
from google.appengine.api import taskqueue

taskqueue.add(url='/_ah/datastore_admin/backup.create',
              method='GET',
              target='ah-builtin-python-bundle',
              params={'kind': ('MyKind1', 'MyKind2')})
If you want to use more parameters that would otherwise go into the cron url, like 'filesystem', put those in the params dict alongside 'kind'.
Programmatically backup datastore based on environment
This comes in addition to Jamie's answer. I needed to back up the datastore to Cloud Storage based on the environment (staging/production). Unfortunately, this can no longer be achieved via a cron job alone, so I needed to do it programmatically and point a cron job at my script. I can confirm that the code below is working, as I saw some people complaining that they get a 404. However, it only works on a live environment, not on the local development server.
from datetime import datetime

from flask.views import MethodView
from google.appengine.api import taskqueue
from google.appengine.api.app_identity import app_identity


class BackupDatastoreView(MethodView):

    BUCKETS = {
        'app-id-staging': 'datastore-backup-staging',
        'app-id-production': 'datastore-backup-production'
    }

    def get(self):
        environment = app_identity.get_application_id()
        task = taskqueue.add(
            url='/_ah/datastore_admin/backup.create',
            method='GET',
            target='ah-builtin-python-bundle',
            queue_name='backup',
            params={
                'filesystem': 'gs',
                'gs_bucket_name': self.get_bucket_name(environment),
                'kind': (
                    'Kind1',
                    'Kind2',
                    'Kind3'
                )
            }
        )
        if task:
            return 'Started backing up %s' % environment

    def get_bucket_name(self, environment):
        return "{bucket}/{date}".format(
            bucket=self.BUCKETS.get(environment, 'datastore-backup'),
            date=datetime.now().strftime("%d-%m-%Y %H:%M")
        )
You can now use the managed export and import feature, which can be accessed through gcloud or the Datastore Admin API:
Exporting and Importing Entities
Scheduling an Export
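For example, a managed export kicked off from Python through the Datastore Admin API might look roughly like this sketch; the project id and bucket name are placeholders, and it assumes the googleapiclient discovery client exposes the projects.export method:
import googleapiclient.discovery


def export_datastore(project_id, bucket):
    """Start a managed export of all entities to the given Cloud Storage bucket."""
    datastore = googleapiclient.discovery.build('datastore', 'v1')
    body = {
        'outputUrlPrefix': 'gs://{}'.format(bucket),
        'entityFilter': {}  # an empty filter exports every kind and namespace
    }
    request = datastore.projects().export(projectId=project_id, body=body)
    return request.execute()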
