Django Tastypie prevent file uri's being saved to a FileField

Django Tastypie prevent file uri's being saved to a FileField - django-models

I've got a Django app with Tastypie, and mainly BackBone client side. One of my models has a few ImageFields. Here is a similar setup to help me explain the issue.
settings.py
MEDIA_URL = "/media/"
models.py
class Foo(models.model):
bar = models.ImageField()
baz = models.CharField()
api.py
class FooResource(ModelResource):
class Meta:
queryset=models.Foo.objects.all()
resource_name = "foo"
authorization = Authorization()
When I make a GET request to the API, it appends the MEDIA_URL to the file names to return the URI where a bar can be accessed. However, when I change the value of baz on a row, and then make a PUT request with that, it also changes the value for a bar to the URI. This means that the next time I GET the row, it appends the MEDIA_URL again, breaking the system and appending it for each successive GET and PUT. I end up with values for bar in the DB that look like.
/media/media/media/bar.jpg
I think I should fix this by overriding a method in my ModelResource, so that when there is a PUT request, it recognizes that it's getting either a URI or a real file, and alters its behavior in some way.
Is this the correct fix? Could you provide some implementation details of a fix?

I found the answer. Tastypie is well designed, similarly to Django. Unfortunately I was not familiar with the terminology so when I read the docs I didn't understand. You can easily modify behavior of the API at many levels. Here is my new API definition, which fixed the issue.
api.py
class FooResource(ModelResource):
class Meta:
queryset=models.Foo.objects.all()
resource_name = "foo"
authorization = Authorization()
def hydrate_bar(bundle):
bundle["bar"] = bundle["bar"].strip(MEDIA_URL)
return bundle
I should add that this only works for me because I exclusively POST my image files individually with a post_detail method which doesn't call this method. If I was to POST or PUT image files as part of the entire row, I expect this might raise an error if that isn't considered.

Related

How To Programmatically Set Up Wagtail Root Page For Tests Utilizing The StaticLiveServerTestCase Suite

This question is similar to another question here on stack overflow .
I am in the process of adding tests to my wagtail site utilizing Django's StaticLiveServerTestCase. Below is an example of the code base I have at hand:
class ExampleTest(StaticLiveServerTestCase):
def setUp(self):
self.browser = webdriver.Chrome()
def test_example_test(self):
self.assertContains("Contact Page", self.browser.content)
[...]
So when I run this test with python manage.py test, the test fails because I there is a 500 error. Please recall that I am using wagtail and NOT Vanilla Django alone. I am also using Django's Site framework as opposed to Wagtail's Site framework as allauth only allows for usage with Django's Site framework.
After applying the #override_settings(DEBUG=True) to the test like this:
#override_settings(DEBUG=True)
class ExampleTest(StaticLiveServerTestCase):
def setUp(self):
self.browser = webdriver.Chrome()
def test_example_test(self):
self.assertContains("Contact Page", self.browser.content)
[...]
The test still fails as the page that is being loaded is the wagtail default page.
My question is, how do I set up another page as the root/default wagtail page such that when a request to localhost:8000 [or any other port number given by the test server] is being made to the home page (i.e. http://localhost:8000/), I see that new page instead of the wagtail default page?
Thanks.

Since StaticLiveServerTestCase creates a new [temporary] "test" database [including running migrations and migrate], wagtail literally resets all sites and pages back to it's initial state after initial wagtail start [mysite] command.
This means that if you have any other Page that you would like to be the root page, you will have to hard code the instruction to do that.
Below is a way in which this can be achieved.
It is advisable to set these instructions within the setUpClass method of a class — usually a class Main() class where other test classes can inherit from; thereby encouraging D.R.Y.
class Main(StaticLiveServerTestCase):
#classmethod
def setUpClass(cls):
super(Main, cls).setUpClass()
cls.root = Page.objects.get(id=1).specific
cls.new_default_home_page = Index(
title="New Home Page Index",
slug="index",
)
cls.root.add_child(instance=cls.new_default_home_page)
cls.site = Site.objects.get(id=1)
cls.site.root_page = cls.new_default_home_page
cls.site.save()
cls.browser = Chrome()
Now my test classes (wherever they are) can inherit from this class and get the entire new home page setup instantly. For example:
# ExampleTest() inherits from Main() for ease of Wagtail Page setup: avoiding repetition of setUpClass().
class ExampleTest(Main):
def test_example_test(self):
self.assertContains("Contact Page", self.browser.title)
[...]
Hope this helps someone out there someday.
THIS SOLUTION IS VALID FOR: wagtail==2.7.4. anything above this version isn't guaranteed to work as wagtail's code base dictates. However, it's very unlikely that this wouldn't work.

Post an ndb StructuredProperty _Message__decoded_fields

My Problem:
I am attempting to fill a datastore model in GAE that contains an ndb.Structured Property() using a 'POST' request.
This question has been asked recently but not answered (How to “POST” ndb.StructuredProperty?)
I have the following two models:
class Check(EndpointsModel):
this = ndb.StringProperty()
that = ndb.StringProperty()
class CheckMessage(EndpointsModel):
check = ndb.StructuredProperty(Check)
I'm trying to post this data:
{
check:
{
"this":"test",
"that":"test"
}
}
with the following API request:
#CheckMessage.method(name='check_insert',path='check/insert',http_method='POST')
def check_insert(self,request):
print(request)
Upon posting from the client I receive the following error:
AttributeError: 'Check' object has no attribute '_Message__decoded_fields'
The Issue:
From my very high-level understanding of the endpoints-proto-datastore module it seems that when the json is being decoded and saved onto the incoming message (utils.py line 431) it's not checking for structured/localstructured properties and saving their keys as well which is all fine and dandy until FromValue (ndb/model.py line 115) checks for instances of structured properties and attempts to recursively convert the structured property from the protorpc message into a model entity (which needs the _Message__decoded_fields).
Sasxa (see the link above) had found a nifty little workaround to this issue by using an EndpointsAliasProperty converted to a ProtoRPC message class to bypass endpoints-proto-datastore's automatic conversion of the structuredproperty into its associated model entity, however this workaround had some side effects that made what I was trying to do difficult.
The Question:
Does anyone know how to correctly fill a datastore model containing a StructuredProperty using a 'POST' request, and are there any working examples of this available?

Provide a callback URL in Google Cloud Storage signed URL

When uploading to GCS (Google Cloud Storage) using the BlobStore's createUploadURL function, I can provide a callback together with header data that will be POSTed to the callback URL.
There doesn't seem to be a way to do that with GCS's signed URL's
I know there is Object Change Notification but that won't allow the user to provide upload specific information in the header of a POST, the way it is possible with createUploadURL's callback.
My feeling is, if createUploadURL can do it, there must be a way to do it with signed URL's, but I can't find any documentation on it. I was wondering if anyone may know how createUploadURL achieves that callback calling behavior.
PS: I'm trying to move away from createUploadURL because of the __BlobInfo__ entities it creates, which for my specific use case I do not need, and somehow seem to be indelible and are wasting storage space.
Update: It worked! Here is how:
Short Answer: It cannot be done with PUT, but can be done with POST
Long Answer:
If you look at the signed-URL page, in front of HTTP_Verb, under Description, there is a subtle note that this page is only relevant to GET, HEAD, PUT, and DELETE, but POST is a completely different game. I had missed this, but it turned out to be very important.
There is a whole page of HTTP Headers that does not list an important header that can be used with POST; that header is success_action_redirect, as voscausa correctly answered.
In the POST page Google "strongly recommends" using PUT, unless dealing with form data. However, POST has a few nice features that PUT does not have. They may worry that POST gives us too many strings to hang ourselves with.
But I'd say it is totally worth dropping createUploadURL, and writing your own code to redirect to a callback. Here is how:
Code:
If you are working in Python voscausa's code is very helpful.
I'm using apejs to write javascript in a Java app, so my code looks like this:
var exp = new Date()
exp.setTime(exp.getTime() + 1000 * 60 * 100); //100 minutes
json['GoogleAccessId'] = String(appIdentity.getServiceAccountName())
json['key'] = keyGenerator()
json['bucket'] = bucket
json['Expires'] = exp.toISOString();
json['success_action_redirect'] = "https://" + request.getServerName() + "/test2/";
json['uri'] = 'https://' + bucket + '.storage.googleapis.com/';
var policy = {'expiration': json.Expires
, 'conditions': [
["starts-with", "$key", json.key],
{'Expires': json.Expires},
{'bucket': json.bucket},
{"success_action_redirect": json.success_action_redirect}
]
};
var plain = StringToBytes(JSON.stringify(policy))
json['policy'] = String(Base64.encodeBase64String(plain))
var result = appIdentity.signForApp(Base64.encodeBase64(plain, false));
json['signature'] = String(Base64.encodeBase64String(result.getSignature()))
The code above first provides the relevant fields.
Then creates a policy object. Then it stringify's the object and converts it into a byte array (you can use .getBytes in Java. I had to write a function for javascript).
A base64 encoded version of this array, populates the policy field.
Then it is signed using the appidentity package. Finally the signature is base64 encoded, and we are done.
On the client side, all members of the json object will be added to the Form, except the uri which is the form's address.
var formData = new FormData(document.forms.namedItem('upload'));
var blob = new Blob([thedata], {type: 'application/json'})
var keys = ['GoogleAccessId', 'key', 'bucket', 'Expires', 'success_action_redirect', 'policy', 'signature']
for(field in keys)
formData.append(keys[field], url[keys[field]])
formData.append('file', blob)
var rest = new XMLHttpRequest();
rest.open('POST', url.uri)
rest.onload = callback_function
rest.send(formData)
If you do not provide a redirect, the response status will be 204 for success. But if you do redirect, the status will be 200. If you got 403 or 400 something about the signature or policy maybe wrong. Look at the responseText. If is often helpful.
A few things to note:
Both POST and PUT have a signature field, but these mean slightly different things. In case of POST, this is a signature of the policy.
PUT has a baseurl which contains the key (object name), but the URL used for POST may only include bucket name
PUT requires expiration as seconds from UNIX epoch, but POST wants it as an ISO string.
A PUT signature should be URL encoded (Java: by wrapping it with a URLEncoder.encode call). But for POST, Base64 encoding suffices.
By extension, for POST do Base64.encodeBase64String(result.getSignature()), and do not use the Base64.encodeBase64URLSafeString function
You cannot pass extra headers with the POST; only those listed in the POST page are allowed.
If you provide a URL for success_action_redirect, it will receive a GET with the key, bucket and eTag.
The other benefit of using POST is you can provide size limits. With PUT however, if a file breached your size restriction, you can only delete it after it was fully uploaded, even if it is multiple-tera-bytes.
What is wrong with createUploadURL?
The method above is a manual createUploadURL.
But:
You don't get those __BlobInfo__ objects which create many indexes and are indelible. This irritates me as it wastes a lot of space (which reminds me of a separate issue: issue 4231. Please go give it a star)
You can provide your own object name, which helps create folders in your bucket.
You can provide different expiration dates for each link.
For the very very few javascript app-engineers:
function StringToBytes(sz) {
map = function(x) {return x.charCodeAt(0)}
return sz.split('').map(map)
}

You can include succes_action_redirect in a policy document when you use GCS post object.
Docs here: Docs: https://cloud.google.com/storage/docs/xml-api/post-object
Python example here: https://github.com/voscausa/appengine-gcs-upload
Example callback result:
def ok(self):
""" GCS upload success callback """
logging.debug('GCS upload result : %s' % self.request.query_string)
bucket = self.request.get('bucket', default_value='')
key = self.request.get('key', default_value='')
key_parts = key.rsplit('/', 1)
folder = key_parts[0] if len(key_parts) > 1 else None

A solution I am using is to turn on Object Changed Notifications. Any time an object is added, a Post is sent to a URL - in my case - a servlet in my project.
In the doPost() I get all info of objected added to GCS and from there, I can do whatever.
This worked great in my App Engine project.

Using bottle.py and blobstore GAE

I recently started using bottle and GAE blobstore and while I can upload the files to the blobstore I cannot seem to find a way to download them from the store.
I followed the examples from the documentation but was only successful on the uploading part. I cannot integrate the example in my app since I'm using a different framework from webapp/2.
How would I go about creating an upload handler and download handler so that I can get the key of the uploaded blob and store it in my data model and use it later in the download handler?
I tried using the BlobInfo.all() to create a query the blobstore but I'm not able to get the key name field value of the entity.
This is my first interaction with the blobstore so I wouldn't mind advice on a better approach to the problem.

For serving a blob I would recommend you to look at the source code of the BlobstoreDownloadHandler. It should be easy to port it to bottle, since there's nothing very specific about the framework.
Here is an example on how to use BlobInfo.all():
for info in blobstore.BlobInfo.all():
self.response.out.write('Name:%s Key: %s Size:%s Creation:%s ContentType:%s<br>' % (info.filename, info.key(), info.size, info.creation, info.content_type))

for downloads you only really need to generate a response that includes the header "X-AppEngine-BlobKey:[your blob_key]" along with everything else you need like a Content-Disposition header if desired. or if it's an image you should probably just use the high performance image serving api, generate a url and redirect to it.... done
for uploads, besides writing a handler for appengine to call once the upload is safely in blobstore (that's in the docs)
You need a way to find the blob info in the incoming request. I have no idea what the request looks like in bottle. The Blobstoreuploadhandler has a get_uploads method and there's really no reason it needs to be an instance method as far as I can tell. So here's an example generic implementation of it that expects a webob request. For bottle you would need to write something similar that is compatible with bottles request object.
def get_uploads(request, field_name=None):
"""Get uploads for this request.
Args:
field_name: Only select uploads that were sent as a specific field.
populate_post: Add the non blob fields to request.POST
Returns:
A list of BlobInfo records corresponding to each upload.
Empty list if there are no blob-info records for field_name.
stolen from the SDK since they only provide a way to get to this
crap through their crappy webapp framework
"""
if not getattr(request, "__uploads", None):
request.__uploads = {}
for key, value in request.params.items():
if isinstance(value, cgi.FieldStorage):
if 'blob-key' in value.type_options:
request.__uploads.setdefault(key, []).append(
blobstore.parse_blob_info(value))
if field_name:
try:
return list(request.__uploads[field_name])
except KeyError:
return []
else:
results = []
for uploads in request.__uploads.itervalues():
results += uploads
return results

For anyone looking for this answer in future, to do this you need bottle (d'oh!) and defnull's multipart module.
Since creating upload URLs is generally simple enough and as per GAE docs, I'll just cover the upload handler.
from bottle import request
from multipart import parse_options_header
from google.appengine.ext.blobstore import BlobInfo
def get_blob_info(field_name):
try:
field = request.files[field_name]
except KeyError:
# Maybe form isn't multipart or file wasn't uploaded, or some such error
return None
blob_data = parse_options_header(field.content_type)[1]
try:
return BlobInfo.get(blob_data['blob-key'])
except KeyError:
# Malformed request? Wrong field name?
return None
Sorry if there are any errors in the code, it's off the top of my head.

How to pass a google mapreduce parameter to done_callback

I'm having trouble setting a parameter when kicking off a mapreduce via start_map so I can access it in done_callback. Numerous things I've read imply that it's possible, but somehow I've not got the earth-moon-stars properly aligned. Ultimately, what I'm trying to accomplish is to delete the temporary blob I created for the mapreduce job.
Here's how I kick it off:
mrID = control.start_map(
"Find friends",
"findfriendshandler.findFriendHandler",
"mapreduce.input_readers.BlobstoreLineInputReader",
{"blob_keys": blobKey},
shard_count=7,
mapreduce_parameters={'done_callback': '/fnfrdone','blobKey': blobKey})
In done_callback, the context object isn't available:
class FindFriendsDoneHandler(webapp.RequestHandler):
def post(self):
ctx = context.get()
if ctx is not None:
params = ctx.mapreduce_spec.mapper.params
try:
blobKey = params['blobKey']
logging.info(['BLOBKEY ' + blobKey])
except KeyError:
logging.info('blobKey key not found in params')
else:
logging.info('context.get did not work') #THIS IS WHAT GETS OUTPUT
Thanks!
EDIT: It seems like there may be more than one MR library, so I wanted to include my various imports:
from mapreduce import control
from mapreduce import operation as op
from mapreduce import context
from mapreduce import model

Below is the code I used in my done_callback handler to retrieve my blobKey user parameter:
class FindFriendsDoneHandler(webapp.RequestHandler):
mrID = self.request.headers['Mapreduce-Id']
try:
mapreduceState = MapreduceState.get_by_key_name(mrID)
mrSpec = mapreduceState.mapreduce_spec
jsonSpec = mrSpec.to_json()
jsonParams = jsonSpec['params']
blobKey = jsonParams['blobKey']
blobInfo = BlobInfo.get(blobKey)
blobInfo.delete()
logging.info('Temp blob deleted successfully for mapreduce:' + mrID)
except:
logging.warning('Unable to delete temp blob for mapreduce:' + mrID)
This uses the mapreduce ID passed into done callback via the header to retrieve the mapreduce state model object from the mapreduce state table. The model stores any user params sent via start_map in a mapreduce_spec property which is in json format.
Note that MR, itself, actually stores the blob_key elsewhere in mapreduce_spec.
Thanks again to #Nick for pointing me to the model.py source file.
I'd love to hear if there's a simpler way to get at MR user params...

Context is only available to mappers/reducers - it's largely concerned with things that don't make sense outside the context of one. As you can see from the source, however, the "Mapreduce-Id" header is set, from which you can get the ID of the mapreduce job.
You shouldn't have to do your own cleanup, though - mapreduce has a handler that does it for you.