I am building a mobile application with a server-side backend.
The application has a lot of different functionality, plus some social-network features.
The problem I am experiencing is about protecting resources on Amazon S3.
Let's imagine that a user has received a URL to Amazon S3 where his friend's picture is stored. The URL is something like:
https://bucket-name.amazon.com/some-unique-id-of-the-picture
So now the user can randomly (or sequentially) start generating some-unique-id-of-the-picture values and gain access to other users' private pictures. How can I restrict this?
Actually, I am more or less OK with that (these pictures aren't very confidential in our application); however, some really confidential user data is stored on Amazon S3 as well (in another bucket), and I definitely need to secure it.
How can I secure this data?
Using time-limited URLs is not an option.
Here is why:
1) Downloading the resource via a time-limited URL and then just using the local copy slows down the application (reading BLOB data from SQLite on Android is slow) and makes the application take up a lot of storage.
2) Caching the resources and storing just the URLs won't work either, since the user can empty the cache after the URLs have already expired.
At the moment the only solution I see:
Make all requests via our server: check whether the user may see the requested resource, then read the data from Amazon and return it to the user.
Unfortunately, this would put quite a significant load on our server(s).
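Something like the following sketch, assuming a Node.js/Express backend with the aws-sdk package (just an assumption about the stack; userCanAccess is a placeholder for our own permission check, and req.user assumes some authentication middleware):

const express = require('express');
const AWS = require('aws-sdk');

const app = express();
const s3 = new AWS.S3();

app.get('/photos/:id', (req, res) => {
  // req.user is assumed to be set by authentication middleware;
  // userCanAccess() is a placeholder for the app's own permission model.
  if (!userCanAccess(req.user, req.params.id)) {
    return res.sendStatus(403);
  }
  // Stream the S3 object straight through to the client.
  s3.getObject({ Bucket: 'confidential-bucket', Key: req.params.id })
    .createReadStream()
    .on('error', () => res.sendStatus(404))
    .pipe(res);
});

app.listen(3000);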
Any help/advice or suggestions would be highly appreciated.
Thank you in advance.
Sorry about the "stupid" title, but I don't really know how to explain this.
I want to have a webpage on my site (built in React) that will show the release notes for each version of my site/product. I could hardcode the content of the release notes in the page, but I want something that allows me to change the content without having to recompile my site.
My site is hosted on AWS, so I was wondering whether there are any patterns for storing the content of the page in an S3 bucket as a text file, or as an entry in DynamoDB.
Does this make sense?
These are the approaches I remember, but I would like to ask how you have done this in the past.
Thank you in advance!
You could really use either S3 or DynamoDB, though S3 ends up being more favorable for a few reasons.
For S3, the general pattern would be to store your formatted release notes as an HTML file (or multiple files) in S3 and have your site make AJAX requests to the S3 object(s), loading the HTML stored there as the page content.
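For example, a rough sketch (the bucket name and release-notes.html are placeholders, the object is assumed to allow public or CORS-enabled reads, and setReleaseNotes stands in for however your React component stores state):

// Fetch the HTML from S3 and hand it to the component's state.
fetch('https://your-bucket.s3.amazonaws.com/release-notes.html')
  .then(function (response) { return response.text(); })
  .then(function (html) { setReleaseNotes(html); })  // e.g. a useState setter
  .catch(function () { setReleaseNotes('<p>Release notes are unavailable right now.</p>'); });
// The component can then render the HTML with dangerouslySetInnerHTML.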
Benefits:
You can make the request client-side and asynchronous via AJAX, so the rest of the page load time isn't negatively impacted.
If you want to change the formatting of the release notes, you can do so by just changing the S3 object. No site recompilation required.
S3 is cheap.
If you were to use DynamoDB, you would have to request the contents server-side and format them server-side (the format would not be changeable without site recompilation). You get 25 read capacity units for free, but if your site sees a lot of traffic, you could end up paying much more than you would with S3.
I would like to know how CDNs serve private data such as images and videos. I came across this Stack Overflow answer, but it seems to be specific to Amazon CloudFront.
As a popular example, let's say the problem in question is serving content inside Facebook. There is access-controlled content at the individual-user level and at the group level, plus some publicly accessible data.
All logic of what can be served to whom resides on the server!
The first request to the CDN will go to the application server and get validated for access rights. But there is a catch to keep in mind:
Assume the first request is successful; after that, anyone will be able to access the image with that CDN URL. I tested this with a restricted, user-uploaded Facebook image, and it was accessible via the CDN URL by others too, even after I logged out. So the image remains accessible until the CDN cache expires.
I believe this should work: all requests first come to the main application server, and after it determines whether access is allowed, it either redirects to the CDN URL or returns an access-denied error.
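A rough sketch of that flow (Express-style; isAllowed and buildCdnUrl are placeholders for your own access logic and your CDN's URL or URL-signing scheme):

app.get('/media/:id', function (req, res) {
  // The application server owns the access decision...
  if (!isAllowed(req.user, req.params.id)) {
    return res.sendStatus(403);
  }
  // ...and only then hands out the CDN location; ideally a short-lived signed URL,
  // so that a cached copy cannot be shared forever.
  res.redirect(buildCdnUrl(req.params.id));
});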
Each CDN works differently, so unless you specify which CDN you are looking at, it's hard to tell.
I'm making a mobile HTML5 webapp and I'm wondering if I can use local storage to enable users to still use the app when they lose internet access.
The basic idea would be that when they have wi-fi / 3G they download the HTML and data, but when they lose internet access they can at least access the last version with old cached data (with a warning that data may not be up to date until they get internet access again).
Is this possible with local storage?
Certainly. One of the purposes of localStorage is to enable offline applications.
You can check (see here for details):
window.navigator.onLine
to see if you are online or offline, or simply:
window.addEventListener("offline", offlineFunc, false)
window.addEventListener("online", onlineFunc, false)
and, if offline, serve the stored content from localStorage by updating the page partially.
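A small sketch combining the two (renderData, showWarning, and fetchFreshDataAndRender are placeholders for your own code):

// Cache the latest data whenever it is fetched while online.
function saveData(key, data) {
  localStorage.setItem(key, JSON.stringify(data));
}

function loadData(key) {
  var raw = localStorage.getItem(key);
  return raw ? JSON.parse(raw) : null;
}

window.addEventListener("offline", function () {
  // Fall back to the last cached copy and warn the user it may be stale.
  renderData(loadData("latestData"));
  showWarning("You are offline; data may be out of date.");
}, false);

window.addEventListener("online", function () {
  fetchFreshDataAndRender(); // re-sync when connectivity returns
}, false);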
Another way of doing this is to use a cache manifest.
Here you can define which files shall be available if the browser goes offline, which require network access, and so forth.
See here for details on that:
https://en.wikipedia.org/wiki/Cache_manifest_in_HTML5
http://diveintohtml5.info/offline.html
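A minimal manifest, referenced from the html element as <html manifest="app.appcache">, could look like this (file names are placeholders):

CACHE MANIFEST
# v1

CACHE:
index.html
app.js
styles.css

NETWORK:
*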
Besides localStorage, you can also use IndexedDB, which additionally allows you to store Blobs (or files); the File API is coming, but is currently only available in Chrome.
I have been reading all over stackoverflow concerning datastore vs blobstore for storing and retrieving image files. Everything is pointing towards blobstore except one: privacy and security.
In the datastore, the photos of my users are private: I have full control over who gets a blob. In the blobstore, however, can anyone who knows the URL conceivably access my users' photos? Is that true?
Here is a quote that is supposed to give me peace of mind, but it's still not clear. So can anyone with the blob key still access the photos? (From Store Photos in Blobstore or as Blobs in Datastore - Which is better/more efficient/cheaper?)
the way you serve a value out of the Blobstore is to accept a request to the app, then respond with the X-AppEngine-BlobKey header with the key. App Engine intercepts the outgoing response and replaces the body with the Blobstore value streamed directly from the service. Because app logic sets the header in the first place, the app can implement any access control it wants. There is no default URL that serves values directly out of the Blobstore without app intervention.
All of this is to ask: Which is more private and more secure for trafficking images, and why: datastore or blobstore? Or, hey, google-cloud-storage (which I know nothing about presently)
If you use google.appengine.api.images.get_serving_url, then yes, the URL returned is public. However, the URL is not guessable from a blob's key, nor does it even exist before you call get_serving_url (or after you call delete_serving_url).
If you need access control on top of the data in the blobstore you can write your own handlers and add the access control there.
BlobProperty is just as private and secure as the Blobstore; it all depends on your application, which serves the requests. Your application can implement any permission checking before sending the contents to the user, so I don't see any difference as long as you serve all the images yourself and don't intentionally create publicly available URLs.
Actually, I would not even think about storing photos in a BlobProperty, because that way the data ends up in the database instead of the Blobstore, and it costs significantly more to store data in the database. The Blobstore, on the other hand, is cheap and convenient.
I am building a website (probably in Wordpress) which takes data from a number of different sources for display on various pages.
The sources:
A Twitter feed
A Flickr feed
A database on a remote server
A local database
From each source I will mainly retrieve
A short string, e.g. the tweet for Twitter, or the title of a blog page from the local database
An associated image, if one exists
A link identifying the content at its source
My question is:
What is the best way to a) store the data and b) retrieve the data?
My thinking is:
i) Write a script that is run every 2 or so minutes on a cron job
ii) The script retrieves data from all sources and stores it in the local database
iii) Application code can then retrieve all data from a single source, the local database
This should make the application code easier to manage (we only ever draw data from one source in application code), and that's the main appeal. But is it overkill for a relatively small site?
I would recommend handling the Twitter feed and the Flickr feed in JavaScript. Both Flickr and Twitter have REST APIs. By putting this on the client, you free up resources on your server, reduce complexity, your users won't be waiting around for your server to fetch the data, and you can let Twitter and Flickr cache the data for you.
This assumes you know JavaScript. Once you get past its quirks, it's not a bad language. Give jQuery a try: there is a jQuery Twitter plugin and the Flickery jQuery plugin, among others; those are just the first results from Google.
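As a quick illustration of the client-side approach (this uses Flickr's public JSONP photo feed; the endpoint and field names are from memory, so double-check them against the current Flickr docs):

// Fetch recent public Flickr photos for a tag and append thumbnails to the page.
$.getJSON("https://api.flickr.com/services/feeds/photos_public.gne?jsoncallback=?",
  { tags: "sunset", format: "json" },
  function (data) {
    $.each(data.items, function (i, item) {
      $("<img>").attr("src", item.media.m).appendTo("#flickr-feed");
    });
  });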
As for your data on the local server and the remote server, that will depend more on the data being fetched. I would go with whatever you can develop the fastest that gives acceptable results. If that means making a REST call from server to server, then go for it. If the remote server is slow to respond, I would go with the client-side AJAX approach there too.
And for the local database, you are going to have to write server-side code, so I would do that inside the WordPress "framework".
Hope that helps.