Full Text Search with ndb endpoints_proto_datastore

I've been using the endpoints_proto_datastore library to build my Endpoints APIs, and I'm trying to figure out how to return a list of records that I've retrieved from the Search API.
The @query_method decorator seems to require a Query to be returned, and it does the fetch call internally. How would I go about implementing an endpoint method that handles full-text search? Do I just define custom protorpc request and response Message classes and skip the endpoints_proto_datastore library altogether?
This is what I tried; it fails with an error saying that list has no ToMessage attribute:
Encountered unexpected error from ProtoRPC method implementation: AttributeError ('list' object has no attribute 'ToMessage')
Traceback (most recent call last):
  File "google_appengine/lib/protorpc-1.0/protorpc/wsgi/service.py", line 181, in protorpc_service_app
    response = method(instance, request)
  File "google_appengine/lib/endpoints-1.0/endpoints/api_config.py", line 1329, in invoke_remote
    return remote_method(service_instance, request)
  File "google_appengine/lib/protorpc-1.0/protorpc/remote.py", line 412, in invoke_remote_method
    response = method(service_instance, request)
  File "third_party/py/endpoints_proto_datastore/ndb/model.py", line 1416, in EntityToRequestMethod
    response = response.ToMessage(fields=response_fields)
AttributeError: 'list' object has no attribute 'ToMessage'
Here's a general view of the code:
class MyModel(EndpointsModel):
    SearchSchema = MessageFieldsSchema(('q',))
    _query_string = None

    def QueryStringSet_(self, value):
        self._query_string = value

    @EndpointsAliasProperty(name='q', setter=QueryStringSet_)
    def query_string(self):
        return self._query_string

class MyServices(...):
    @MyModel.method(
        request_fields=MyModel.SearchSchema,
        name='search', path='mymodel/search')
    def SearchMyModel(self, request):
        return MyModel.Search(request.q)

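If you were using Java, the answer would be to use CollectionResponse:
import com.google.api.server.spi.response.CollectionResponse;
In Python, you need to define your own response Message classes. The traceback shows why returning a list fails: the method decorator calls ToMessage() on whatever the method returns, so it expects a single model instance rather than a list.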

Related

Why can't I invoke sagemaker endpoint with either bytes or file as payload

I have deployed a linear regression model on SageMaker. Now I want to write a Lambda function to make predictions on input data. Files are pulled from S3 first, some preprocessing is done, and the final input is a pandas dataframe. According to the boto3 SageMaker documentation, the payload can be either bytes-like or a file. So I tried to convert the dataframe to a byte array using code from this post:
# Convert pandas dataframe to byte array
pred_np = pred_df.to_records(index=False)
pred_str = pred_np.tostring()
# Start sagemaker prediction
sm_runtime = aws_session.client('runtime.sagemaker')
response = sm_runtime.invoke_endpoint(
    EndpointName=SAGEMAKER_ENDPOINT,
    Body=pred_str,
    ContentType='text/csv',
    Accept='Accept')
I printed out pred_str which does seem like a byte array to me.
However when I run it, I got the following Algorithm Error caused by UnicodeDecodeError:
Caused by: 'utf8' codec can't decode byte 0xed in position 9: invalid continuation byte
The traceback shows Python 2.7; I'm not sure why that is:
Traceback (most recent call last):
  File "/opt/amazon/lib/python2.7/site-packages/ai_algorithms_sdk/serve.py", line 465, in invocations
    data_iter = get_data_iterator(payload, **content_parameters)
  File "/opt/amazon/lib/python2.7/site-packages/ai_algorithms_sdk/io/serve_helpers.py", line 99, in iterator_csv_dense_rank_2
    payload = payload.decode("utf8")
  File "/opt/amazon/python2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
Is the default decoder utf_8? What is the right decoder I should be using? Why is it complaining about position 9?
I also tried saving the dataframe to a CSV file and using that as the payload:
pred_df.to_csv('pred.csv', index=False)
with open('pred.csv', 'rb') as f:
    payload = f.read()
response = sm_runtime.invoke_endpoint(
    EndpointName=SAGEMAKER_ENDPOINT,
    Body=payload,
    ContentType='text/csv',
    Accept='Accept')
However when I ran it I got the following error:
Customer Error: Unable to parse payload. Some rows may have more columns than others and/or non-numeric values may be present in the csv data.
And again, the traceback is calling python 2.7:
Traceback (most recent call last):
  File "/opt/amazon/lib/python2.7/site-packages/ai_algorithms_sdk/serve.py", line 465, in invocations
    data_iter = get_data_iterator(payload, **content_parameters)
  File "/opt/amazon/lib/python2.7/site-packages/ai_algorithms_sdk/io/serve_helpers.py", line 123, in iterator_csv_dense_rank_2
This doesn't make sense to me, because it is a standard 6x78 dataframe. All rows have the same number of columns, and none of the columns are non-numeric.
How to fix this sagemaker issue?

I was finally able to make it work with the following code:
import io

payload = io.StringIO()
pred_df.to_csv(payload, header=None, index=None)
sm_runtime = aws_session.client('runtime.sagemaker')
response = sm_runtime.invoke_endpoint(
    EndpointName=SAGEMAKER_ENDPOINT,
    Body=payload.getvalue(),
    ContentType='text/csv',
    Accept='Accept')
It is very important to call getvalue() on the payload when invoking the endpoint. Hope this helps.
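A minimal sketch of why the two earlier attempts probably failed, written with only the standard library so it runs without pandas or AWS access. The column names, rows, and the helpers `rows_to_csv_body` and `all_numeric` are made up for illustration. It rests on two assumptions: raw binary floats (roughly what `to_records().tostring()` produces) are generally not valid UTF-8 text, and `to_csv` writes a header row of column names unless told otherwise, which a numeric-only CSV parser would reject.

```python
import csv
import io
import struct

# Hypothetical data standing in for the DataFrame.
columns = ["feature_a", "feature_b"]
rows = [[1.5, 2.0], [3.25, 4.0]]

# 1) Raw binary floats are not text: 1.5 packs to bytes containing 0xf8,
#    which can never start a UTF-8 sequence, so decoding fails.
raw = struct.pack("<d", 1.5)
try:
    raw.decode("utf8")
    raw_is_text = True
except UnicodeDecodeError:
    raw_is_text = False


def rows_to_csv_body(rows, header=None):
    """Serialize rows to CSV text, optionally preceded by a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    if header:
        writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def all_numeric(body):
    """Return True if every cell in the CSV text parses as a float."""
    try:
        for line in csv.reader(io.StringIO(body)):
            for cell in line:
                float(cell)
    except ValueError:
        return False
    return True


with_header = rows_to_csv_body(rows, header=columns)
without_header = rows_to_csv_body(rows)

print(raw_is_text)                  # False
print(all_numeric(with_header))     # False: column names are not numbers
print(all_numeric(without_header))  # True
print(without_header)
```

In this sketch, the string returned by `getvalue()` is what gets sent as the request body; the `StringIO` object itself is only a buffer for building it.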

Django Model MultipleObjectsReturned

I am using the Django ORM to talk to a SQL Server DB.
I used the .raw() method to run a query:
@classmethod
def execute_native_queries(cls, query):
    return cls.objects.raw(query)
I later convert the result into a list by running: data = list(modelName.execute_native_queries(query))
While iterating through the list, I access certain columns like this:
for entry in data:
    a = entry.colA
    b = entry.colB
    c = entry.colC
For some entries the loop iteration runs fine, but for others I get the following error:
api.models.modelName.MultipleObjectsReturned: get() returned more than one modelName-- it returned 2!
What I don't understand is why this error is surfacing.
EDIT: Added the stacktrace
Traceback (most recent call last):
  File "<full filepath>\a.py", line 178, in method1
    'vc': data.vc,
  File "C:\FAST\Python\3.6.4\lib\site-packages\django\db\models\query_utils.py", line 137, in __get__
    instance.refresh_from_db(fields=[self.field_name])
  File "C:\FAST\Python\3.6.4\lib\site-packages\django\db\models\base.py", line 605, in refresh_from_db
    db_instance = db_instance_qs.get()
  File "C:\FAST\Python\3.6.4\lib\site-packages\django\db\models\query.py", line 403, in get
    (self.model._meta.object_name, num)
api.models.modelName.MultipleObjectsReturned: get() returned more than one modelName-- it returned 2!

GAE: 'Response' object has no attribute 'set_cookie'

I am trying to test a customized Session for Google AppEngine (1.9.15). It uses response.set_cookie(). Printing dir(response) doesn't show the function exists. Any ideas how I can get a response object that has this function?
from pprint import pprint
from google.appengine.ext import webapp
response = webapp.Response()
pprint(dir(response))
<google.appengine.ext.webapp._webapp25.Response object at 0x100e6d110>
['_Response__HTTP_STATUS_MESSAGES',
'_Response__status',
'_Response__wsgi_headers',
'__class__',
'__delattr__',
'__dict__',
'__doc__',
'__format__',
'__getattribute__',
'__hash__',
'__init__',
'__module__',
'__new__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__setattr__',
'__sizeof__',
'__str__',
'__subclasshook__',
'__weakref__',
'clear',
'has_error',
'headers',
'http_status_message',
'out',
'set_status',
'status',
'status_message',
'wsgi_write']

According to the documentation the response is built by the request handler (in response to a request), so try to print it inside one of the handlers of your app:
The request handler instance builds the response using its response
property. This is initialized to an empty WebOb Response object by the
application.
The response object acts as a file-like object that can be used for
writing the body of the response:
import webapp2

class MyHandler(webapp2.RequestHandler):
    def get(self):
        self.response.write("<html><body><p>Hi there!</p></body></html>")

serve different wsgiapplications depending on request domain on GAE with threadsafe:true

What I'm trying to do is load different applications (webapp2.WSGIApplication) depending on the request domain.
For example, www.domain_1.com should load the application in app1.main.application, while www.domain_2.com should load app2.main.application.
Of course, I'm on the same GAE app ID, and I'm using namespaces to separate the apps' data.
This works pretty well with 'threadsafe:false' and a runner.py file in which a function determines which application to return.
It seems that with 'threadsafe:true' the first request loads the WSGI application into the instance, and further requests no longer execute the application-dispatching logic, so the request gets a response from the wrong app.
I'm using Python 2.7 and webapp2.
What is the best way to do this?
Edit:
A very simplified version of my runner.py:
def main():
    if domain == 'www.mydomain_1.com':
        from app_1 import application
        namespace = 'app_1'
    elif domain == 'www.domain_2.com':
        from app_2 import application
        namespace = 'app_2'
    namespace_manager.set_namespace(namespace)
    return wsgiref.handlers.CGIHandler().run(application)

if __name__ == '__main__':
    main()
and in app.yaml
- url: /.*
  script: app-runner.py

Your runner script is a CGI script. The full behavior of a CGI script with multithreading turned on is not documented, and the way the docs are written I'm guessing this won't be supported fully. Instead, the docs say you must refer to the WSGI application object directly from app.yaml, using the module path to a global variable containing the object, when multithreading is turned on. (CGI scripts retain their old behavior in Python 2.7 with multithreading turned off.)
The behavior you're seeing is explained by your use of imports. Within a single instance, each import statement only has an effect the first time it is encountered. After that, the module is assumed to be imported and the import statement has no effect on subsequent requests. You can import both values into separate names, then call run() with the appropriate value.
But if you want to enable multithreading (and that's a good idea), your dispatcher should be a WSGI application itself, stored in a module global referred to by app.yaml. I don't know offhand how to dispatch a request to another WSGI application from within a WSGI application, but that might be a reasonable thing to do. Alternatively, you might consider using or building a layer above WSGI to do this dispatch.
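One way to do the dispatch this answer is unsure about: a WSGI application is just a callable taking `environ` and `start_response`, so a dispatcher can inspect the Host header and call through to another application. The following is a minimal standard-library sketch, not a tested App Engine setup. `DomainDispatcher`, `make_app`, `not_found`, and the module name `runner` are placeholders; the two inner apps stand in for the real webapp2 applications, and the namespace handling from the question is only indicated in a comment.

```python
def make_app(body):
    """Build a trivial WSGI app that always returns `body` (stand-in for a real app)."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]
    return app


def not_found(environ, start_response):
    """Fallback app for hosts that have no registered application."""
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"unknown domain"]


class DomainDispatcher(object):
    """WSGI app that forwards each request to the app registered for its host."""

    def __init__(self, apps_by_domain, default=not_found):
        self.apps_by_domain = apps_by_domain
        self.default = default

    def __call__(self, environ, start_response):
        # HTTP_HOST may carry a port ("example.com:8080"); strip it.
        host = environ.get("HTTP_HOST", "").split(":")[0].lower()
        app = self.apps_by_domain.get(host, self.default)
        # Per-request setup (e.g. namespace_manager.set_namespace) would go here.
        return app(environ, start_response)


# Module-level global that app.yaml could point at
# (assumed form: "script: runner.application").
application = DomainDispatcher({
    "www.domain_1.com": make_app(b"app 1"),
    "www.domain_2.com": make_app(b"app 2"),
})
```

Both inner applications are created once at import time and the choice between them is made inside `__call__`, so the decision is re-evaluated on every request instead of depending on which import ran first.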

I made it work by subclassing webapp2.WSGIApplication and overriding __call__(), which is called before dispatching to a RequestHandler.
I prefix the routes (and remove the prefix in the handler's initialize) and sub-structure the config per namespace so that the apps can share the instance memory.
class CustomWSGIApplication(webapp2.WSGIApplication):
    def __call__(self, environ, start_response):
        routes, settings, ns = get_app(environ)
        namespace_manager.set_namespace(ns)
        environ['PATH_INFO'] = '/%s%s' % (ns, environ.get('PATH_INFO'))
        for route in routes:
            r, h = route  # returns a tuple with mapping and handler
            newroute = ('/%s%s' % (ns, r), h,)
            self.router.add(newroute)
        if settings:
            self.config[ns] = settings
        self.debug = debug
        with self.request_context_class(self, environ) as (request, response):
            try:
                if request.method not in self.allowed_methods:
                    # 501 Not Implemented.
                    raise exc.HTTPNotImplemented()
                rv = self.router.dispatch(request, response)
                if rv is not None:
                    response = rv
            except Exception, e:
                try:
                    # Try to handle it with a custom error handler.
                    rv = self.handle_exception(request, response, e)
                    if rv is not None:
                        response = rv
                except HTTPException, e:
                    # Use the HTTP exception as response.
                    response = e
                except Exception, e:
                    # Error wasn't handled so we have nothing else to do.
                    response = self._internal_error(e)
            try:
                return response(environ, start_response)
            except Exception, e:
                return self._internal_error(e)(environ, start_response)

Image response in Google App Engine

I am saving a user's image as a BlobProperty by doing:
user.image = urlfetch.fetch(image_url).content
Then I'm rendering that image using a url such as:
/image/user_id
The image must be saving, because when I do len(user.image) I get a number in the thousands, and on the local instance the image renders fine. On the deployed app, I get the following error, and nothing shows in the browser when I go to the image URL:
Traceback (most recent call last):
  File "/base/python27_runtime/python27_dist/lib/python2.7/wsgiref/handlers.py", line 86, in run
    self.finish_response()
  File "/base/python27_runtime/python27_dist/lib/python2.7/wsgiref/handlers.py", line 127, in finish_response
    self.write(data)
  File "/base/python27_runtime/python27_dist/lib/python2.7/wsgiref/handlers.py", line 202, in write
    assert type(data) is StringType,"write() argument must be string"
AssertionError: write() argument must be string
Also, here's the handler that serves the image:
class ImageHandler(webapp2.RequestHandler):
    """ Returns image based on id. """
    def get(self, *args, **kwargs):
        user = db.get(
            db.Key.from_path('User', models.User.get_key_name(kwargs.get('id'))))
        if user.image:
            self.response.headers['Content-Type'] = "image/jpeg"
            self.response.out.write(user.image)
        else:
            self.response.out.write("No image")
Just to clarify I tried both setting content-type to jpeg and png. And things are working ok on the local server. Any help would be appreciated. Thanks!

Why not write the image to the blobstore and then use the send_blob() mechanism?
http://code.google.com/appengine/docs/python/blobstore/overview.html#Serving_a_Blob

Answering my own question, this fixes everything:
self.response.out.write(str(user.image))
It's confusing because the example in the docs does not cast the BlobProperty value to a string.
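A likely explanation, sketched in plain Python: the traceback shows wsgiref asserting `type(data) is StringType`, an exact-type check that a subclass of the string type fails even though it behaves like a string. Casting with `str()` yields an instance of the exact base type. This assumes the stored blob value is such a subclass. The `Blob` class and `strict_write` function below are hypothetical stand-ins, written against Python 3 `bytes` so the sketch runs on a current interpreter.

```python
class Blob(bytes):
    """Hypothetical stand-in for a datastore blob value: a bytes subclass."""


def strict_write(data):
    """Mimic an exact-type check like the assertion in the traceback."""
    if type(data) is not bytes:
        raise AssertionError("write() argument must be a bytes instance")
    return len(data)


image = Blob(b"\x89PNG fake image data")

print(isinstance(image, bytes))   # True: it is a bytes subclass
print(type(image) is bytes)       # False: but not exactly bytes

try:
    strict_write(image)
    rejected = False
except AssertionError:
    rejected = True
print(rejected)                   # True

# Converting to the exact base type satisfies the check.
print(strict_write(bytes(image)))
```

If this is indeed the cause, a development server that uses a more lenient `isinstance`-style check would explain why the handler worked locally but not when deployed.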
