Serve different WSGI applications depending on request domain on GAE with threadsafe:true - google-app-engine

What I'm trying to do is load different applications (webapp2.WSGIApplication) depending on the request domain.
For example, www.domain_1.com should load the application in app1.main.application, while www.domain_2.com should load app2.main.application.
Of course I'm on the same GAE app id, and I'm using namespaces to separate the apps' data.
This works pretty well with 'threadsafe: false' and a runner.py file where a function determines which application to return.
It seems that with 'threadsafe: true' the first request loads the WSGI application into the instance, and further requests don't execute the 'application dispatching' logic any more, so the request gets a response from the wrong app.
I'm using Python 2.7 and webapp2.
What is the best way to do this?
Edit:
A very simplified version of my runner.py:
import wsgiref.handlers
from google.appengine.api import namespace_manager

def main():
    # 'domain' is derived from the request, e.g. os.environ.get('HTTP_HOST')
    if domain == 'www.mydomain_1.com':
        from app_1 import application
        namespace = 'app_1'
    elif domain == 'www.domain_2.com':
        from app_2 import application
        namespace = 'app_2'
    namespace_manager.set_namespace(namespace)
    return wsgiref.handlers.CGIHandler().run(application)

if __name__ == '__main__':
    main()
and in app.yaml
- url: /.*
  script: app-runner.py

Your runner script is a CGI script. The full behavior of a CGI script with multithreading turned on is not documented, and the way the docs are written I'm guessing this won't be supported fully. Instead, the docs say you must refer to the WSGI application object directly from app.yaml, using the module path to a global variable containing the object, when multithreading is turned on. (CGI scripts retain their old behavior in Python 2.7 with multithreading turned off.)
The behavior you're seeing is explained by your use of imports. Within a single instance, each import statement only has an effect the first time it is encountered. After that, the module is assumed to be imported and the import statement has no effect on subsequent requests. You can import both values into separate names, then call run() with the appropriate value.
But if you want to enable multithreading (and that's a good idea), your dispatcher should be a WSGI application itself, stored in a module global referred to by app.yaml. I don't know offhand how to dispatch a request to another WSGI application from within a WSGI application, but that might be a reasonable thing to do. Alternatively, you might consider using or building a layer above WSGI to do this dispatch.
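A minimal sketch of such a WSGI-level dispatcher, assuming app_1.application and app_2.application are module-level webapp2.WSGIApplication objects (module names taken from the simplified runner above) and that app.yaml points at something like runner.dispatcher:

from google.appengine.api import namespace_manager

# Import both applications once, under separate names.
from app_1 import application as app_1_application
from app_2 import application as app_2_application


def dispatcher(environ, start_response):
    # Decide per request instead of at import time.
    host = environ.get('HTTP_HOST', '')
    if host.startswith('www.mydomain_1.com'):
        namespace_manager.set_namespace('app_1')
        target = app_1_application
    else:
        namespace_manager.set_namespace('app_2')
        target = app_2_application
    # Delegate to the selected WSGI application.
    return target(environ, start_response)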

Made it happen by subclassing webapp2.WSGIApplication and overriding __call__(), which is called before dispatching to a RequestHandler.
I prefix the routes (and remove the prefix in the handler's initialize()) and sub-structure the config per namespace so I can use the instance memory.
import webapp2
from google.appengine.api import namespace_manager
from webob import exc
from webob.exc import HTTPException


class CustomWSGIApplication(webapp2.WSGIApplication):

    def __call__(self, environ, start_response):
        # get_app() (defined elsewhere) returns the routes, config and
        # namespace for the requested domain.
        routes, settings, ns = get_app(environ)
        namespace_manager.set_namespace(ns)
        environ['PATH_INFO'] = '/%s%s' % (ns, environ.get('PATH_INFO'))
        for route in routes:
            r, h = route  # each route is a tuple of (path template, handler)
            newroute = ('/%s%s' % (ns, r), h,)
            self.router.add(newroute)
        if settings:
            self.config[ns] = settings
        self.debug = debug  # 'debug' is assumed to be a module-level flag
        with self.request_context_class(self, environ) as (request, response):
            try:
                if request.method not in self.allowed_methods:
                    # 501 Not Implemented.
                    raise exc.HTTPNotImplemented()
                rv = self.router.dispatch(request, response)
                if rv is not None:
                    response = rv
            except Exception, e:
                try:
                    # Try to handle it with a custom error handler.
                    rv = self.handle_exception(request, response, e)
                    if rv is not None:
                        response = rv
                except HTTPException, e:
                    # Use the HTTP exception as response.
                    response = e
                except Exception, e:
                    # Error wasn't handled so we have nothing else to do.
                    response = self._internal_error(e)
            try:
                return response(environ, start_response)
            except Exception, e:
                return self._internal_error(e)(environ, start_response)
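With threadsafe: true, app.yaml then has to point at a module-level instance of this class rather than a CGI script. A minimal sketch, assuming the class above lives in runner.py:

# runner.py
application = CustomWSGIApplication()

and in app.yaml:

- url: /.*
  script: runner.application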

Related

scala - Gatling - I can't seem to use Session Variables stored from a request in a subsequent Request

The code:
package simulations

import io.gatling.core.Predef._
import io.gatling.http.Predef._

class StarWarsBasicExample extends Simulation
{
  // 1 Http Conf
  val httpConf = http.baseUrl("https://swapi.dev/api/films/")

  // 2 Scenario Definition
  val scn = scenario("Star Wars API")
    .exec(http("Get Number")
      .get("4")
      .check(jsonPath("$.episode_id")
        .saveAs("episodeId"))
    )
    .exec(session => {
      val movie = session("episodeId").as[String]
      session.set("episode", movie)
    }).pause(4)
    .exec(http("$episode")
      .get("$episode"))

  // 3 Load Scenario
  setUp(
    scn.inject(atOnceUsers(1)))
    .protocols(httpConf)
}
Trying to grab a variable from the first GET request and inject that variable into a second request, but unable to do so despite following the documentation. There might be something I'm not understanding.
When I use breakpoints and navigate through the process, it appears the session execution happens AFTER both of the other requests have been completed (by which time it is too late). I can't seem to make that session execution happen between the two requests.
Already answered on Gatling's community mailing list.
"$episode" is not correct Gatling Expression Language syntax. "${episode}" is correct.

How can I keep a backend module ('manual_scaling') running indefinitely?

I have defined a route in a backend module this way:
(r'/_ah/start', 'ConversionTaskQueueWorker'),
My backend module is terminated like so:
But if I put a log statement in
class ConversionTaskQueueWorker(webapp2.RequestHandler):
    def get(self):
        """Indefinitely fetch tasks and update the datastore."""
        q = taskqueue.Queue(TASK_QUEUE_NAME)
        while True:
            LOG.info('Keeping it alive')
            ...
it stays up and running.
Why is that?
I don't want to flood my log with this message. Is there an alternative way to keep the backend module running?

Why does the NDB context cache set an environment variable?

Looking through the Google NDB code, I can't quite work out why the context cache sets an environment variable.
The code in question:
https://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/ext/ndb/tasklets.py
_CONTEXT_KEY = '__CONTEXT__'

def get_context():
    # XXX Docstring
    ctx = None
    if os.getenv(_CONTEXT_KEY):
        ctx = _state.current_context
    if ctx is None:
        ctx = make_default_context()
        set_context(ctx)
    return ctx

(...)

def set_context(new_context):
    # XXX Docstring
    os.environ[_CONTEXT_KEY] = '1'
    _state.current_context = new_context
I know what it does, but why? (Speculation on my side removed, I don't want to mislead answers.)
Update:
The _state is based on this code:
class _State(utils.threading_local):
    """Hold thread-local state."""
    current_context = None

(...)
https://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/ext/ndb/utils.py
# Define a base class for classes that need to be thread-local.
# This is pretty subtle; we want to use threading.local if threading
# is supported, but object if it is not.
if threading.local.__module__ == 'thread':
    logging_debug('Using threading.local')
    threading_local = threading.local
else:
    logging_debug('Not using threading.local')
    threading_local = object
On App Engine, environment variables are scoped to the request, so they provide a way of getting at the context anywhere in your code without having to pass a specific object around or provide a request-specific lookup mechanism.
Some environment variables are set before the request is processed, from the real environment and app.yaml.
Then for each request the environment variables are set from appengine_config.py, then from the WSGI environment for the request, then by the handler, and then other components contribute (i.e. your code may populate the environment); all of this is specific to each request.
So the environment is considered threadsafe (i.e. it won't leak things across concurrent requests).
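As a concrete illustration of why the request-scoped environment matters here: anywhere inside request-handling code you can fetch the current context without passing it around. A small sketch (the cache-policy tweak and the 'Session' kind are just hypothetical examples):

from google.appengine.ext import ndb

def somewhere_deep_in_request_code():
    # Returns the per-request context that get_context()/set_context() manage.
    ctx = ndb.get_context()
    # Example: skip the in-process cache for a hypothetical 'Session' kind.
    ctx.set_cache_policy(lambda key: key.kind() != 'Session')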

Is there a function to find the current handler for the named route in my app

I can get the current app / WSGI instance with webapp2.get_app() and the current request instance with webapp2.get_request(), but how do I get the current webapp2 handler instance from:
class MainHandler(webapp2.RequestHandler):
    def get(self):
for:
webapp2.Route(r'/', handler=module.MainHandler, name='main'),
without using "self" to refer to this object. Is it possible?
The route object in the request object contains the handler name, but not the instance.
UPDATE: A solution has not been found yet. For now I store the handler (self) in a global, using dispatch() of the webapp2.RequestHandler. But there must be another way.
To find a solution I am studying Nick Johnson's "Writing your own webapp framework for App Engine" (http://blog.notdot.net/2010/01/Writing-your-own-webapp-framework-for-App-Engine) to understand how webapp2 works.
What I have done:
With webapp2.get_request() I can find the request.route and the request.route.handler_adapter instance, but not the handler instance. The handler instance is not saved.
Conclusion: I use the constructor of my webapp2.RequestHandler to save the handler instance (self) in the request registry (threadsafe). And I do not have to match the route name, because for every request new instances (handler and request) are created.
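A minimal sketch of that approach, assuming a common base handler; 'current_handler' is just an arbitrary registry key:

import webapp2

class BaseHandler(webapp2.RequestHandler):
    def dispatch(self):
        # The request registry is per-request, so this is safe with threadsafe: true.
        self.request.registry['current_handler'] = self
        super(BaseHandler, self).dispatch()

# Anywhere else during the same request:
def get_current_handler():
    return webapp2.get_request().registry.get('current_handler')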
Your question (or perhaps the example code snippet) may need to be more well-defined in order for folks to provide a suitable answer.
As far as I can tell, you seem to be looking for a way to find the call stack of some function in order to determine the nearest RequestHandler instance. If that's the case, then this is more of a general Python question than webapp2, but the traceback module may be what you're looking for.
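If you do go the stack-inspection route, a sketch along those lines (using the inspect module rather than traceback) might look like this:

import inspect

import webapp2

def find_calling_handler():
    # Walk up the call stack and return the nearest RequestHandler's 'self'.
    for frame_record in inspect.stack():
        instance = frame_record[0].f_locals.get('self')
        if isinstance(instance, webapp2.RequestHandler):
            return instance
    return None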

Using bottle.py and blobstore GAE

I recently started using bottle and the GAE blobstore, and while I can upload files to the blobstore, I cannot seem to find a way to download them from the store.
I followed the examples from the documentation but was only successful on the uploading part. I cannot integrate the example into my app since I'm using a different framework from webapp/webapp2.
How would I go about creating an upload handler and download handler so that I can get the key of the uploaded blob and store it in my data model and use it later in the download handler?
I tried using the BlobInfo.all() to create a query the blobstore but I'm not able to get the key name field value of the entity.
This is my first interaction with the blobstore so I wouldn't mind advice on a better approach to the problem.
For serving a blob I would recommend looking at the source code of BlobstoreDownloadHandler. It should be easy to port to bottle, since there's nothing very specific to the framework in it.
Here is an example of how to use BlobInfo.all():
for info in blobstore.BlobInfo.all():
    self.response.out.write('Name:%s Key: %s Size:%s Creation:%s ContentType:%s<br>' % (
        info.filename, info.key(), info.size, info.creation, info.content_type))
For downloads, you only really need to generate a response that includes the header "X-AppEngine-BlobKey: [your blob_key]" along with everything else you need, like a Content-Disposition header if desired. Or, if it's an image, you should probably just use the high-performance image serving API: generate a URL and redirect to it, and you're done.
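In bottle that boils down to something like the following sketch, assuming the blob key is passed in the URL:

from bottle import route, response

@route('/serve/<blob_key>')
def serve_blob(blob_key):
    # App Engine's frontend streams the blob when it sees this header;
    # the response body itself stays empty.
    response.set_header('X-AppEngine-BlobKey', blob_key)
    response.set_header('Content-Disposition', 'attachment')
    return ''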
For uploads, besides writing a handler for App Engine to call once the upload is safely in the blobstore (that's in the docs),
you need a way to find the blob info in the incoming request. I have no idea what the request looks like in bottle. The BlobstoreUploadHandler has a get_uploads() method, and there's really no reason it needs to be an instance method as far as I can tell. So here's an example generic implementation of it that expects a WebOb request. For bottle you would need to write something similar that is compatible with bottle's request object.
import cgi

from google.appengine.ext import blobstore


def get_uploads(request, field_name=None):
    """Get uploads for this request.

    Args:
      request: A WebOb-style request object.
      field_name: Only select uploads that were sent as a specific field.

    Returns:
      A list of BlobInfo records corresponding to each upload.
      Empty list if there are no blob-info records for field_name.

    Stolen from the SDK since they only provide a way to get to this
    crap through their crappy webapp framework.
    """
    if not getattr(request, "__uploads", None):
        request.__uploads = {}
        for key, value in request.params.items():
            if isinstance(value, cgi.FieldStorage):
                if 'blob-key' in value.type_options:
                    request.__uploads.setdefault(key, []).append(
                        blobstore.parse_blob_info(value))
    if field_name:
        try:
            return list(request.__uploads[field_name])
        except KeyError:
            return []
    else:
        results = []
        for uploads in request.__uploads.itervalues():
            results += uploads
        return results
For anyone looking for this answer in future, to do this you need bottle (d'oh!) and defnull's multipart module.
Since creating upload URLs is simple enough and covered in the GAE docs, I'll just cover the upload handler.
from bottle import request
from multipart import parse_options_header
from google.appengine.ext.blobstore import BlobInfo


def get_blob_info(field_name):
    try:
        field = request.files[field_name]
    except KeyError:
        # Maybe form isn't multipart or file wasn't uploaded, or some such error
        return None
    blob_data = parse_options_header(field.content_type)[1]
    try:
        return BlobInfo.get(blob_data['blob-key'])
    except KeyError:
        # Malformed request? Wrong field name?
        return None
Sorry if there are any errors in the code, it's off the top of my head.
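A hedged usage sketch, wiring get_blob_info() into the route that blobstore.create_upload_url() points at ('/upload_handler' and the 'file' field name are just assumptions):

from bottle import post, redirect

@post('/upload_handler')
def upload_handler():
    # 'file' must match the name of the file input in the upload form.
    info = get_blob_info('file')
    if info is None:
        return 'Upload failed'
    # Store str(info.key()) on your model here, then redirect somewhere useful.
    redirect('/')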
