Django Model MultipleObjectsReturned - sql-server

I am using the Django ORM to talk to a SQL Server database.
I used the .raw() method to run a query:
@classmethod
def execute_native_queries(cls, query):
    return cls.objects.raw(query)
I later convert it to a list by running: data = list(modelName.execute_native_queries(query))
While iterating through the list, I access certain columns like this:
for entry in data:
    a = entry.colA
    b = entry.colB
    c = entry.colC
For certain entries the loop iteration runs fine, but for others I get the following error:
api.models.modelName.MultipleObjectsReturned: get() returned more than one modelName-- it returned 2!
What I do not understand is why this error is surfacing at all.
EDIT: Added the stack trace:
Traceback (most recent call last):
File "<full filepath>\a.py", line 178, in method1
'vc': data.vc,
File "C:\FAST\Python\3.6.4\lib\site-packages\django\db\models\query_utils.py", line 137, in __get__
instance.refresh_from_db(fields=[self.field_name])
File "C:\FAST\Python\3.6.4\lib\site-packages\django\db\models\base.py", line 605, in refresh_from_db
db_instance = db_instance_qs.get()
File "C:\FAST\Python\3.6.4\lib\site-packages\django\db\models\query.py", line 403, in get
(self.model._meta.object_name, num)
api.models.modelName.MultipleObjectsReturned: get() returned more than one modelName-- it returned 2!
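Judging by the traceback, the error is not raised by .raw() itself: accessing data.vc triggers Django's deferred-field loading (query_utils.py __get__ -> refresh_from_db -> get()), which happens when the raw SQL did not return that column. refresh_from_db() filters on the model's primary key, so if that key value is not unique in the underlying SQL Server table, or the raw query does not return the real primary key column, the internal get() can match two rows. A minimal sketch of the usual workaround, assuming an 'id' primary key column and a table name that are only illustrative:

@classmethod
def execute_native_queries(cls, query):
    return cls.objects.raw(query)

# Select the primary key plus every column the loop accesses later, so Django
# never has to refresh the instance with a follow-up .get().
data = list(modelName.execute_native_queries(
    "SELECT id, colA, colB, colC, vc FROM some_table"))

for entry in data:
    a = entry.colA  # no deferred-field refresh, so no MultipleObjectsReturned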

Related

How to copy DB from disk to memory with IronPython2

I'm trying to figure out how to copy an existing database file on disk to memory to make queries on it faster. I know how to do this in CPython3 with:
import sqlite3
db_path = r"C:\path to\database.db"
db_disk = sqlite3.connect(db_path)
db_memory = sqlite3.connect(':memory:')
db_disk.backup(db_memory)
but the .backup() function doesn't exist in IronPython2 (SQLite library version 3.7.7).
After researching various posts, I've tried:
import clr
clr.AddReference('IronPython.SQLite.dll')
import sqlite3

db_path = r"C:\path to\database.db"
db_disk = sqlite3.connect(db_path)
db_memory = sqlite3.connect(':memory:')
script = ''.join(db_disk.iterdump())
db_memory.executescript(script)
and
db_server = sqlite3.connect(db_path)
db_memory = sqlite3.connect(':memory:')
script = "".join(line for line in db_server.iterdump())
db_memory.executescript(script)
But I keep getting an error at the iterdump() line in both versions:
Warning: IronPythonEvaluator.EvaluateIronPythonScript operation failed.
Traceback (most recent call last):
File "<string>", line 72, in <module>
NotImplementedError: Not supported with C#-sqlite for unknown reasons.
The code above came from these posts:
Post 1
Post 2
Post 3
I was going to try the solution in this post, but I don't have apsw and I can't load any packages.
I was also going to try the solution in post 1 above, but again, I can't get pandas.io.sql or sqlalchemy.
Can anyone point me to a snippet of code that accomplishes this or correct my current code?
Thanks.
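One possible workaround, since iterdump() is unsupported in this build, is to let SQLite copy the data itself by attaching the on-disk file to the in-memory connection and running CREATE TABLE ... AS SELECT per table. This is only a sketch: whether C#-sqlite (IronPython.SQLite) supports ATTACH DATABASE is an assumption to verify, and indexes/triggers are not copied this way.

import clr
clr.AddReference('IronPython.SQLite.dll')
import sqlite3

db_path = r"C:\path to\database.db"
db_memory = sqlite3.connect(':memory:')
cur = db_memory.cursor()

# Attach the on-disk database under the alias 'src' (assumes ATTACH is supported).
cur.execute("ATTACH DATABASE ? AS src", (db_path,))

# Copy each user table's data; internal sqlite_* tables are skipped.
cur.execute("SELECT name FROM src.sqlite_master "
            "WHERE type='table' AND name NOT LIKE 'sqlite_%'")
for (table_name,) in cur.fetchall():
    cur.execute("CREATE TABLE main.%s AS SELECT * FROM src.%s" % (table_name, table_name))

db_memory.commit()
cur.execute("DETACH DATABASE src")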

Why can't I invoke sagemaker endpoint with either bytes or file as payload

I have deployed a linear regression model on SageMaker. Now I want to write a Lambda function to make predictions on input data. Files are pulled from S3 first, some preprocessing is done, and the final input is a pandas dataframe. According to the boto3 SageMaker documentation, the payload can be either bytes-like or a file object. So I tried to convert the dataframe to a byte array using code from this post:
# Convert pandas dataframe to byte array
pred_np = pred_df.to_records(index=False)
pred_str = pred_np.tostring()

# Start sagemaker prediction
sm_runtime = aws_session.client('runtime.sagemaker')
response = sm_runtime.invoke_endpoint(
    EndpointName=SAGEMAKER_ENDPOINT,
    Body=pred_str,
    ContentType='text/csv',
    Accept='Accept')
I printed out pred_str, and it does look like a byte string to me.
However, when I run it, I get the following algorithm error caused by a UnicodeDecodeError:
Caused by: 'utf8' codec can't decode byte 0xed in position 9: invalid continuation byte
The traceback shows Python 2.7; I'm not sure why that is:
Traceback (most recent call last):
File "/opt/amazon/lib/python2.7/site-packages/ai_algorithms_sdk/serve.py", line 465, in invocations
data_iter = get_data_iterator(payload, **content_parameters)
File "/opt/amazon/lib/python2.7/site-packages/ai_algorithms_sdk/io/serve_helpers.py", line 99, in iterator_csv_dense_rank_2
payload = payload.decode("utf8")
File "/opt/amazon/python2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
Is the default decoder utf_8? What is the right decoder I should be using? Why is it complaining about position 9?
In addition, I also tried saving the dataframe to a CSV file and using that as the payload:
pred_df.to_csv('pred.csv', index=False)

with open('pred.csv', 'rb') as f:
    payload = f.read()

response = sm_runtime.invoke_endpoint(
    EndpointName=SAGEMAKER_ENDPOINT,
    Body=payload,
    ContentType='text/csv',
    Accept='Accept')
However, when I ran it, I got the following error:
Customer Error: Unable to parse payload. Some rows may have more columns than others and/or non-numeric values may be present in the csv data.
And again, the traceback references Python 2.7:
Traceback (most recent call last):
File "/opt/amazon/lib/python2.7/site-packages/ai_algorithms_sdk/serve.py", line 465, in invocations
data_iter = get_data_iterator(payload, **content_parameters)
File "/opt/amazon/lib/python2.7/site-packages/ai_algorithms_sdk/io/serve_helpers.py", line 123, in iterator_csv_dense_rank_2
This doesn't make sense to me, because it is a standard 6x78 dataframe: all rows have the same number of columns, and none of the columns are non-numeric.
How do I fix this SageMaker issue?
I was finally able to make it work with the following code:
import io

payload = io.StringIO()
pred_df.to_csv(payload, header=None, index=None)

sm_runtime = aws_session.client('runtime.sagemaker')
response = sm_runtime.invoke_endpoint(
    EndpointName=SAGEMAKER_ENDPOINT,
    Body=payload.getvalue(),
    ContentType='text/csv',
    Accept='Accept')
It is very important to call the getvalue() function on the payload when invoking the endpoint. Hope this helps.
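For context on the first error: to_records().tostring() produces the raw binary bytes of a NumPy record array, not CSV text, so the container's payload.decode("utf8") step fails on the first non-UTF-8 byte. A rough equivalent of the working solution that builds the UTF-8 CSV bytes directly (same pred_df and sm_runtime as above; only a sketch):

# pandas returns the CSV as a string when no path is given; encode it to UTF-8
# bytes for the Body. Header and index are omitted so every row is purely
# numeric, which is what the algorithm expects for text/csv.
csv_bytes = pred_df.to_csv(header=False, index=False).encode('utf-8')

response = sm_runtime.invoke_endpoint(
    EndpointName=SAGEMAKER_ENDPOINT,
    Body=csv_bytes,
    ContentType='text/csv')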

TransientError: Temporary search service error - Python Google App Engine Search

I'm trying to run a query, sorting by the 'updated_at' field, with a large number_found_accuracy:
order_options = search.SortOptions(
    expressions=[search.SortExpression(expression='updated_at',
                                       direction=search.SortExpression.DESCENDING)])

query_options = search.QueryOptions(
    limit=50,
    cursor=search.Cursor(),
    sort_options=order_options,
    number_found_accuracy=25000)

index = search.Index('contacts', namespace='default')
query_future = index.search_async(search.Query("", options=query_options))
contacts = query_future.get_result()
When get_result() is called, I get the error below:
File "/base/alloc/tmpfs/dynamic_runtimes/python27g/3b44e98ed7fbb86b/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1535, in call
rv = self.handle_exception(request, response, e)
File "/base/alloc/tmpfs/dynamic_runtimes/python27g/3b44e98ed7fbb86b/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1529, in call
rv = self.router.dispatch(request, response)
File "/base/alloc/tmpfs/dynamic_runtimes/python27g/3b44e98ed7fbb86b/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1278, in default_dispatcher
return route.handler_adapter(request, response)
File "/base/alloc/tmpfs/dynamic_runtimes/python27g/3b44e98ed7fbb86b/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1102, in call
return handler.dispatch()
File "/base/data/home/apps/p~imobzi-app/20181127t101400.414282042583891084/modules/base_handler.py", line 72, in dispatch
super(BaseHandler, self).dispatch()
File "/base/alloc/tmpfs/dynamic_runtimes/python27g/3b44e98ed7fbb86b/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 572, in dispatch
return self.handle_exception(e, self.app.debug)
File "/base/alloc/tmpfs/dynamic_runtimes/python27g/3b44e98ed7fbb86b/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 570, in dispatch
return method(*args, **kwargs)
File "/base/data/home/apps/p~imobzi-app/20181127t101400.414282042583891084/main.py", line 132, in get
contacts = query_future.get_result()
File "/base/alloc/tmpfs/dynamic_runtimes/python27g/3b44e98ed7fbb86b/python27/python27_lib/versions/1/google/appengine/api/search/search.py", line 281, in get_result
raise _ToSearchError(e)
TransientError: Temporary search service error
The error occurs when I use "number_found_accuracy" and "sort_options" in the same query and the query result is large (this query returns more than 50,000 results).
If either "number_found_accuracy" or "sort_options" is removed from query_options I get the result normally, but if both are in query_options, the error occurs.
Normally I'd remove "number_found_accuracy" from the query, but I need to show the result count to the user and sort by the updated_at field. Does anyone know a way to solve this? It only happens when I deploy the project to the server; in a local/development environment everything works as expected.
One reason for this error can be that the length of the query exceeds the limit of 2000 characters as specified in the documentation.
There is also a limitation on sorting that can be worked around by presorting your documents in the index using Document rank as explained in this StackOverflow answer.
Also note that as per the documentation
number_found:
Returns an approximate number of documents matching the query. QueryOptions defining post-processing of the search results. If the QueryOptions.number_found_accuracy parameter were set to 100, then number_found <= 100 is accurate.
Since you have set number_found_accuracy=25000, any search result with a size greater than 25000 will be approximate.
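A minimal sketch of the rank-based workaround mentioned above: if each document's rank is set from its updated_at timestamp at indexing time, results come back newest-first by default and SortOptions (and its interaction with number_found_accuracy) can be dropped from the query. The contact object and its fields here are illustrative assumptions:

import time
from google.appengine.api import search

# Documents are returned in descending rank order when no sort is specified,
# so encoding updated_at into the rank gives newest-first results for free.
doc = search.Document(
    doc_id=str(contact.key.id()),
    fields=[
        search.TextField(name='name', value=contact.name),
        search.DateField(name='updated_at', value=contact.updated_at.date()),
    ],
    rank=int(time.mktime(contact.updated_at.timetuple())))
search.Index('contacts', namespace='default').put(doc)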

Full Text Search with ndb endpoints_proto_datastore

I've been using the endpoints_proto_datastore library to build my Endpoints APIs, and I am trying to figure out how to return a list of records retrieved from the Search API.
The @query_method decorator seems to require a Query type as a return value, and it does the fetch call internally. How would I go about implementing an endpoint method that handles full-text search? Do I just define a custom protorpc request Message and response Message and skip the endpoints_proto_datastore library altogether?
This is what I tried and got an error that list doesn't have ToMessage attribute.
Encountered unexpected error from ProtoRPC method implementation: AttributeError ('list' object has no attribute 'ToMessage')
Traceback (most recent call last):
File "google_appengine/lib/protorpc-1.0/protorpc/wsgi/service.py", line 181, in protorpc_service_app
response = method(instance, request)
File "google_appengine/lib/endpoints-1.0/endpoints/api_config.py", line 1329, in invoke_remote
return remote_method(service_instance, request)
File "google_appengine/lib/protorpc-1.0/protorpc/remote.py", line 412, in invoke_remote_method
response = method(service_instance, request)
File "third_party/py/endpoints_proto_datastore/ndb/model.py", line 1416, in EntityToRequestMethod
response = response.ToMessage(fields=response_fields)
AttributeError: 'list' object has no attribute 'ToMessage'
Here's a general view of the code:
class MyModel(EndpointsModel):
    SearchSchema = MessageFieldsSchema(('q',))

    _query_string = None

    def QueryStringSet_(self, value):
        self._query_string = value

    @EndpointsAliasProperty(name='q', setter=QueryStringSet_)
    def query_string(self):
        return self._query_string


class MyServices(...):

    @MyModel.method(
        request_fields=MyModel.SearchSchema,
        name='search', path='mymodel/search')
    def SearchMyModel(self, request):
        return MyModel.Search(request.q)
If you were using Java then the answer would be to use
import com.google.api.server.spi.response.CollectionResponse;
In Python you need to create your own response Message classes.
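A minimal sketch of that approach in Python, bypassing endpoints_proto_datastore for this one method and returning a hand-built protorpc message; the message/field names and the 'title' field are illustrative assumptions:

import endpoints
from protorpc import messages, remote
from google.appengine.api import search

class SearchRequest(messages.Message):
    q = messages.StringField(1, required=True)

class MyModelMessage(messages.Message):
    id = messages.StringField(1)
    title = messages.StringField(2)

class MyModelSearchResponse(messages.Message):
    items = messages.MessageField(MyModelMessage, 1, repeated=True)

@endpoints.api(name='myapi', version='v1')
class MyServices(remote.Service):

    @endpoints.method(SearchRequest, MyModelSearchResponse,
                      path='mymodel/search', name='mymodel.search')
    def SearchMyModel(self, request):
        # Wrap each ScoredDocument from the Search API in a plain message.
        results = search.Index('mymodel').search(request.q)
        items = []
        for doc in results:
            title = next((f.value for f in doc.fields if f.name == 'title'), None)
            items.append(MyModelMessage(id=doc.doc_id, title=title))
        return MyModelSearchResponse(items=items)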

How can I read a blob that was written to the Blobstore by a Pipeline within the test framework?

I have a pipeline that creates a blob in the blobstore and places the resulting blob_key in one of its named outputs. When I run the pipeline through the web interface I have built around it, everything works wonderfully. Now I want to create a small test case that will execute this pipeline, read the blob out from the blobstore, and store it to a temporary location somewhere else on disk so that I can inspect it. (Since testbed.init_files_stub() only stores the blob in memory for the life of the test).
The pipeline within the test case seems to work fine and results in what looks like a valid blob_key, but when I pass that blob_key to the blobstore.BlobReader class, it cannot find the blob for some reason. From the traceback, it seems like the BlobReader is trying to access the real blobstore, while the writer (inside the pipeline) is writing to the stubbed blobstore. I have --blobstore_path set up on dev_appserver.py, and I do not see any blobs written to disk by the test case, but when I run it from the web interface, the blobs do show up there.
Here is the traceback:
Traceback (most recent call last):
File "/Users/mattfaus/dev/webapp/coach_resources/student_use_data_report_test.py", line 138, in test_serial_pipeline
self.write_out_blob(stage.outputs.xlsx_blob_key)
File "/Users/mattfaus/dev/webapp/coach_resources/student_use_data_report_test.py", line 125, in write_out_blob
writer.write(reader.read())
File "/Users/mattfaus/Desktop/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/blobstore/blobstore.py", line 837, in read
self.__fill_buffer(size)
File "/Users/mattfaus/Desktop/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/blobstore/blobstore.py", line 809, in __fill_buffer
self.__position + read_size - 1)
File "/Users/mattfaus/Desktop/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/blobstore/blobstore.py", line 657, in fetch_data
return rpc.get_result()
File "/Users/mattfaus/Desktop/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/apiproxy_stub_map.py", line 604, in get_result
return self.__get_result_hook(self)
File "/Users/mattfaus/Desktop/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/blobstore/blobstore.py", line 232, in _get_result_hook
raise _ToBlobstoreError(err)
BlobNotFoundError
Here is my test code:
def write_out_blob(self, blob_key, save_path='/tmp/blob.xlsx'):
    """Reads a blob from the blobstore and writes it out to the file."""
    print str(blob_key)

    # blob_info = blobstore.BlobInfo.get(str(blob_key))  # Returns None
    # reader = blob_info.open()  # Returns None
    reader = blobstore.BlobReader(str(blob_key))
    writer = open(save_path, 'w')
    writer.write(reader.read())

    print blob_key, 'written to', save_path

def test_serial_pipeline(self):
    stage = student_use_data_report.StudentUseDataReportSerialPipeline(
        self.query_config)
    stage.start_test()

    self.assertIsNotNone(stage.outputs.xlsx_blob_key)
    self.write_out_blob(stage.outputs.xlsx_blob_key)
It might be useful if you showed how you finalize the blobstore file, or if you could try that finalization code separately. It sounds like the Files API didn't finalize the file correctly on the dev appserver.
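For reference, a minimal sketch of the (long-deprecated) Files API finalization pattern the pipeline would typically need before a blob_key becomes readable; the xlsx_bytes name is illustrative:

from google.appengine.api import files

# Create a writable blobstore file, append the data, then finalize it.
file_name = files.blobstore.create(
    mime_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
with files.open(file_name, 'a') as f:
    f.write(xlsx_bytes)  # assumed to hold the generated workbook bytes
files.finalize(file_name)

# The blob key only exists after finalize() has completed.
blob_key = files.blobstore.get_blob_key(file_name)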
Turns out that I was simply missing the .value property, here:
self.assertIsNotNone(stage.outputs.xlsx_blob_key)
self.write_out_blob(stage.outputs.xlsx_blob_key.value) # Don't forget .value!!
[UPDATE]
The SDK dashboard also exposes a list of all blobs in your blobstore, conveniently sorted by creation date. It is available at http://127.0.0.1:8000/blobstore.
