I was wondering if anyone knows how long image URLs served from the Google App Engine blobstore remain valid.
I have been tracking one URL that I used to serve an image from the blobstore on 1/3/13, and it's still working.
I ask specifically so I can cache the image URL instead of generating it repeatedly. If I did this I would still check whether the image is there, but how often would I need to check?
thanks!
They remain valid until you either
a. call delete_serving_url, or
b. delete the underlying blob.
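For reference, here is a minimal Python sketch of the calls involved; the function names are illustrative, and blob_key is assumed to already reference an image stored in the blobstore:

    from google.appengine.api import images
    from google.appengine.ext import blobstore

    def make_cached_url(blob_key):
        # The serving URL stays valid indefinitely, so it is safe to cache/store it.
        return images.get_serving_url(blob_key)

    def stop_serving(blob_key):
        # A cached URL only stops working after one of these:
        images.delete_serving_url(blob_key)  # (a) explicitly delete the serving URL
        blobstore.delete(blob_key)           # (b) delete the underlying blob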
I would like to know how CDNs serve private data such as images and videos. I came across this Stack Overflow answer, but it seems to be specific to Amazon CloudFront.
As a popular example, let's say the problem in question is serving content inside Facebook. There is access-controlled content at the individual-user level and at the group-of-users level, and there is also some publicly accessible data.
All logic of what can be served to whom resides on the server!
The first request to the CDN goes to the application server and gets validated for access rights. But there is a catch to keep in mind:
Assume that first request is successful; after that, anyone will be able to access the image with that CDN URL. I tested this with a restricted image uploaded by a Facebook user, and others could access it with the CDN URL even after I logged out. So the image remains accessible until the CDN cache expires.
I believe this should work: all requests first come to the main application server. After it determines whether access is allowed, it either redirects to the CDN or returns an access-denied error.
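A rough sketch of that flow on App Engine with webapp2, assuming an illustrative CDN host; the access-control rule and the way the user is identified are placeholders:

    import webapp2

    CDN_BASE = 'https://cdn.example.com/'  # illustrative CDN host

    def user_may_view(user_id, image_id):
        # Placeholder for the real access-control rule (e.g. a datastore lookup).
        return user_id is not None

    class ImageHandler(webapp2.RequestHandler):
        def get(self, image_id):
            user_id = self.request.cookies.get('user_id')  # however the app identifies users
            if not user_may_view(user_id, image_id):
                self.abort(403)  # access denied; the CDN URL is never revealed
            # Access allowed: hand the client off to the CDN copy.
            self.redirect(str(CDN_BASE + image_id))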
Each CDN works differently, so unless you specify which CDN you are looking at, it's hard to say more.
Consider an image (avatar) uploader to Google Cloud Storage: the upload starts in the user's web browser, then passes through a Go App Engine instance which handles standard compression/cropping etc., and finally stores the resulting image as an object in Cloud Storage.
How can I ensure that the App Engine instance isn't overloaded by too much data or bad data? In other words, I think I'm asking two questions (or possibly not):
How can I limit the amount of data allowed to be sent to an App Engine instance in a single request, or is there already a default safe limit?
How can I validate the data to make sure it's a proper jpg/png/gif before attempting to process it with the standard Go image libraries?
All App Engine requests are limited to 32MB.
You can check the size of the file being uploaded before the upload starts.
You can verify the file's mime-type and only allow correct files to be uploaded.
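As a rough sketch of both checks (shown in Python for consistency with the rest of this page; in Go the analogous tools would be http.MaxBytesReader and image.DecodeConfig), you can reject oversized payloads and sniff the leading "magic" bytes before doing any real image work:

    MAX_UPLOAD_BYTES = 8 * 1024 * 1024  # illustrative app limit, well under the 32MB request cap

    # Leading bytes ("magic numbers") of the accepted formats.
    MAGIC_PREFIXES = {
        b'\xff\xd8\xff': 'image/jpeg',
        b'\x89PNG\r\n\x1a\n': 'image/png',
        b'GIF87a': 'image/gif',
        b'GIF89a': 'image/gif',
    }

    def sniff_image_type(data):
        """Return the detected mime type, or None if it doesn't look like jpg/png/gif."""
        for prefix, mime in MAGIC_PREFIXES.items():
            if data.startswith(prefix):
                return mime
        return None

    def validate_upload(data):
        if len(data) > MAX_UPLOAD_BYTES:
            raise ValueError('upload too large')
        mime = sniff_image_type(data)
        if mime is None:
            raise ValueError('not a recognizable jpg/png/gif')
        return mime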
After using the GoogleStorageTools class's CloudStorageTools::getImageServingUrl and then replacing the storage object of the image with another image of the same name, the old image is still displayed upon subsequent calls to getImageServingUrl.
I tried using CloudStorageTools::deleteImageServingUrl and then CloudStorageTools::getImageServingUrl again, but this doesn't work.
Is there any way to interact with Cloud Storage and tell it to refresh the image or the image URL? I'm guessing not, and am going to ensure the filenames are unique, instead, but it feels like there ought to be a way.
If you refresh the image, does the new image show up? It's possible there's a cache-control policy set on the image. Google Cloud Storage allows users to specify what cache-control headers should be sent to browsers, but I'm not sure whether app engine's getImageServingUrl respects that value.
As an experiment, could you try going to console.developers.google.com, heading over to "storage > cloud storage > storage browser", choosing the appropriate object, choosing "edit metadata," and then seeing whether there's a Cache-Control policy on the object? Try changing the cache-control section to "max-age=0,no-cache".
The current procedure to serve an image is as follows (a consolidated code sketch of both the serve and remove steps follows the lists below):
Store image on google cloud storage
Get blob_key: google.appengine.ext.blobstore.create_gs_key(filename)
Get url: google.appengine.api.images.get_serving_url(blob_key,size=250,secure_url=True)
To remove the image, after retrieving the blob_key:
Delete serving url:
google.appengine.api.images.delete_serving_url(blob_key)
Delete the Google Cloud Storage file: cloudstorage.delete(filename)
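For reference, the steps above consolidated into a rough Python sketch; the '/gs' prefix passed to create_gs_key and the shape of filename (a '/bucket/object' GCS path) are assumptions about how the files are named:

    from google.appengine.ext import blobstore
    from google.appengine.api import images
    import cloudstorage

    def get_image_url(filename):
        # filename is a GCS path like '/my-bucket/avatars/user1.jpg' (illustrative)
        blob_key = blobstore.create_gs_key('/gs' + filename)
        return images.get_serving_url(blob_key, size=250, secure_url=True)

    def remove_image(filename):
        blob_key = blobstore.create_gs_key('/gs' + filename)
        images.delete_serving_url(blob_key)  # step 1: delete the serving URL
        cloudstorage.delete(filename)        # step 2: delete the GCS object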
Issue
The issue is that the URL keeps serving for an undefined amount of time, even though the underlying image no longer exists on Google Cloud Storage. Most of the time the URL returns 404 within ~24 hours, but I have also seen one image still serving now (~2 weeks later).
What are the expectations about the promptness of the delete_serving_url call? Are there any alternatives that delete the URL faster?
I can address one of your two questions. Unfortunately, it's the less helpful one. :/
What are the expectations about the promptness of the delete_serving_url call?
Looking at the Java documentation for getServingUrl, it clearly spells out that you should expect this to take up to 24 hours, as you observed. I'm not sure why the Python documentation leaves this point out.
If you wish to stop serving the URL, delete the underlying blob key. This takes up to 24 hours to take effect.
The documentation doesn't explain why one of your images would still be serving after 2 weeks.
It is also interesting to note that the docs don't reference deleteServingUrl as part of the process of stopping serving a blob. That suggests to me that step 1 in your "delete the image" process is unnecessary.
I have been reading all over Stack Overflow about the datastore vs. the blobstore for storing and retrieving image files. Everything points towards the blobstore except one thing: privacy and security.
In the datastore, my users' photos are private: I have full control over who gets a blob. In the blobstore, however, anyone who knows the URL can conceivably access my users' photos? Is that true?
Here is a quote that is supposed to give me peace of mind, but it's still not clear. So anyone with the blob key can still access the photos? (from Store Photos in Blobstore or as Blobs in Datastore - Which is better/more efficient /cheaper?)
the way you serve a value out of the Blobstore is to accept a request
to the app, then respond with the X-AppEngine-BlobKey header with the
key. App Engine intercepts the outgoing response and replaces the body
with the Blobstore value streamed directly from the service. Because
app logic sets the header in the first place, the app can implement
any access control it wants. There is no default URL that serves
values directly out of the Blobstore without app intervention.
All of this is to ask: which is more private and more secure for trafficking images, and why: the datastore or the blobstore? Or, hey, Google Cloud Storage (which I know nothing about presently)?
If you use google.appengine.api.images.get_serving_url then yes, the URL returned is public. However, the URL is not guessable from a blob's key, nor does the URL even exist before you call get_serving_url (or after you call delete_serving_url).
If you need access control on top of the data in the blobstore, you can write your own handlers and add the access control there.
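For example, a sketch of such a handler; the permission check is a placeholder, and the serving mechanism matches the quoted description above (send_blob sets the X-AppEngine-BlobKey header and App Engine streams the blob body directly from the service):

    from google.appengine.ext import blobstore
    from google.appengine.ext.webapp import blobstore_handlers

    class PrivatePhotoHandler(blobstore_handlers.BlobstoreDownloadHandler):
        def get(self, blob_key_str):
            blob_info = blobstore.BlobInfo.get(blobstore.BlobKey(blob_key_str))
            if blob_info is None:
                self.error(404)
                return
            if not self.user_may_view(blob_info):  # placeholder permission check
                self.error(403)
                return
            # send_blob sets the X-AppEngine-BlobKey header; App Engine replaces
            # the response body with the blob streamed from the Blobstore service.
            self.send_blob(blob_info)

        def user_may_view(self, blob_info):
            # Application-specific rule, e.g. compare the blob's owner with the logged-in user.
            return False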
A BlobProperty is just as private and secure as the Blobstore; it all depends on your application, which serves the requests. Your application can implement any permission checking before sending the contents to the user, so I don't see any difference as long as you serve all the images yourself and don't intentionally create publicly available URLs.
Actually, I would not even think about storing photos in a BlobProperty, because that way the data ends up in the database instead of the Blobstore, and it costs significantly more to store data in the database. The Blobstore, on the other hand, is cheap and convenient.
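For comparison, a minimal sketch of the BlobProperty approach with the same kind of application-level permission check; the model and handler names are illustrative:

    from google.appengine.ext import ndb
    import webapp2

    class UserPhoto(ndb.Model):
        owner_id = ndb.StringProperty()
        image_data = ndb.BlobProperty()  # image bytes stored inside the datastore entity

    class PhotoHandler(webapp2.RequestHandler):
        def get(self, photo_id):
            photo = UserPhoto.get_by_id(int(photo_id))
            if photo is None or not self.user_may_view(photo):
                self.abort(404)  # hide existence as well as content
            self.response.headers['Content-Type'] = 'image/jpeg'
            self.response.out.write(photo.image_data)

        def user_may_view(self, photo):
            # Application-specific rule, e.g. photo.owner_id == current user's id.
            return False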