Only allow S3 bucket access to authenticated users from specified domain - reactjs

I'm currently building a React project that embeds small, static applications uploaded to an S3 bucket.
Those applications are all built with HTML/CSS/vanilla JS, meaning they all share the same structure, with an index.html file as the entry point.
Embedding those applications on the site through an iframe whose source points to the index.html works well, but I now want to make sure that only users who are registered and have the correct access rights can access a given application.
Originally I wanted to handle this using pre-signed URLs, but that doesn't seem to work since I couldn't find a way to use a pre-signed URL to get access to all the files in a folder in S3.
I then thought about handling everything in React/Express: make sure the user is authenticated and has the correct role, and only then send the src link back to the frontend, where it gets embedded in the iframe. Additionally, I would add a bucket policy that only allows my specific domain to fetch the resources.
However, from other threads I gather that it's easy to spoof the HTTP Referer header, meaning that if somebody gets the access link to the application on S3 they could simply send a request with a spoofed Referer and get their hands on the content.
I'm in over my head here and trying to figure out what the best architecture is. If it's something completely removed from the setup I currently have I'm happy to change it all around.
Generally, though, I'm hoping for something that gives me an added layer of security and makes it impossible to access the content in the S3 bucket unless the request comes directly from one specific host after the user has authenticated there.
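Concretely, the Express part I had in mind would be something like the sketch below; requireAuth, the role lookup, and the bucket URL are all placeholders for whatever the real setup ends up being:

```js
const express = require("express");
const app = express();

// Base URL of the bucket holding the static apps (placeholder).
const BUCKET_BASE_URL = "https://my-app-bucket.s3.amazonaws.com";

// Hypothetical middleware that populates req.user from a session or JWT.
function requireAuth(req, res, next) {
  if (!req.user) return res.status(401).json({ error: "not authenticated" });
  next();
}

app.get("/api/apps/:appId/embed-url", requireAuth, (req, res) => {
  // Placeholder role check: does this user have access to this app?
  const allowed = Array.isArray(req.user.apps) && req.user.apps.includes(req.params.appId);
  if (!allowed) return res.status(403).json({ error: "forbidden" });

  // The frontend drops this into <iframe src="...">.
  res.json({ src: `${BUCKET_BASE_URL}/${req.params.appId}/index.html` });
});

app.listen(3000);
```

Of course this only hides the link behind authentication; it doesn't stop anyone who already has the S3 URL, which is exactly the gap I'm trying to close.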

Related

Share files from a personal computer through a web page (with authorized access)

My internship tutor wants to make some of his professional files available to some of his coworkers by means of a website (with authorized access).
I tried suggesting several options: an FTP client, a NAS device, or a router with USB.
But he wants to do it through the website we are constructing right now, after, of course, a successful login.
Is there a solution to what he wants?
You can try Firebase Cloud Storage: https://firebase.google.com/docs/storage/
You just need to create a project and include the JavaScript code in your HTML page.
It also lets you define rules, based on authentication, about who can view or add/edit the stored files.
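For example, the client-side piece could look roughly like this with the Firebase Web SDK (v9 modular API); the config values, element ID, and file path are placeholders from your own project:

```js
import { initializeApp } from "firebase/app";
import { getAuth, signInWithEmailAndPassword } from "firebase/auth";
import { getStorage, ref, getDownloadURL } from "firebase/storage";

// Placeholder config from your own Firebase project.
const app = initializeApp({ apiKey: "...", projectId: "...", storageBucket: "..." });

async function showSharedFile(email, password, path) {
  // Sign the coworker in first; Storage security rules can then require
  // request.auth != null before allowing reads.
  await signInWithEmailAndPassword(getAuth(app), email, password);

  // Resolve a download URL for the protected file and link to it
  // (assumes an <a id="file-link"> element exists on the page).
  const url = await getDownloadURL(ref(getStorage(app), path));
  document.querySelector("#file-link").href = url;
}
```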

Access Google Cloud Storage from web application, always 403

I am working on a web application that serves as an interface to Google Cloud Storage (GCS).
I am using a backend service to retrieve the list of files I stored on GCS and their URLs with the JSON API, and return that to my web application. However, I was not able to load the files through those URLs; the requests always came back with 403 Forbidden.
I am not sure how GCS authentication works behind the scenes and whether it is possible to grant access directly to a web application. I am also not sure how I could attach the application's authentication information to an HTTP request. I know I can do this via the backend service, but for the sake of simplicity I wonder if it is possible to get around that. One of the things I tried was adding the web application's domain (which is sent via the Referer header) to the bucket's ACL, which doesn't work at all.
And thanks to what @Brandon pointed out below: I am OK with granting anyone who has access to the application permission to view the content on GCS, since it is an internal app and I have already checked their authentication when I first serve the web application.
====
Solution
I ended up using signed URLs that expire in 5 minutes, and I highly recommend interacting with GCS using gcloud (their Python documentation is really good). Thanks again for the thorough answer!
You have a user on a web browser who wants to download an object that only your application's service account has read access to. You have a few options:
Expand access: make these objects publicly readable. Probably not the best choice if this info is sensitive, but if it's not, this is the easiest solution.
Give your app's credentials to the user so that they can authenticate as your app. This is a REALLY bad idea, and I probably shouldn't even list it here.
When a user wants to download a file, have them ask your app for it, and then have your app fetch the file and stream its contents to the user. This is the easiest solution for the client-side code, but it makes your app responsible for streaming file contents, which isn't really great.
When a user wants to download a file, have them ask your app for permission, and reply to them with some sort of token they can use to fetch the data directly from GCS.
#4 is what you want. Your users will ask your app for a file, your app will decide whether they are allowed to access that file via whatever you're doing (passwords? IP checks? Cookies? Whatever.) Then, your app will respond with a URL the user can use to fetch the file directly from GCS.
This URL is called a "signed URL." Your app uses its own private key to add a signature to a URL that indicates which object may be downloaded by the bearer and for how long the URL is valid. The procedure for signing URLs is somewhat tricky, but fortunately the Google Cloud Storage client libraries have helper functions that can generate them.
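For example, with the Node.js client library it looks roughly like this (the Python library the asker ended up using has an equivalent helper); the bucket and object names are made up:

```js
const { Storage } = require("@google-cloud/storage");

// Credentials come from the app's service account as usual.
const storage = new Storage();

async function getDownloadUrl(objectName) {
  const [url] = await storage
    .bucket("my-private-bucket")
    .file(objectName)
    .getSignedUrl({
      action: "read",
      expires: Date.now() + 5 * 60 * 1000, // valid for 5 minutes
    });
  return url; // hand this back to the browser, which then fetches GCS directly
}
```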

How to use NodeJS to combat social sharing and search engine issues when using single-page frameworks like AngularJS

I read an article about social sharing issues in AngularJS and how to combat them by using Apache as a proxy.
The solution is usable for small websites. But if a web app has 20+ different pages, I have to URL-rewrite and create static files for all of them. Moreover, using PHP and Apache adds a different stack to the app.
Can we use NodeJS as the proxy and rewrite the URLs, and what's the approach?
Is there a way to minimize static file creation?
Is there a way to remove the proxy, URL rewriting, and static files altogether? For example, inside our NodeJS app we could check the user agent; if it is the Facebook bot, Twitter, or the like, we use the request module to download our page and return the raw HTML to them. Is that a plausible solution?
Normally when someone shares a URL on a social network, that social network requests the page to generate a preview/thumbnail (aka "scrape").
Most likely those scrapers won't run JavaScript, so they need a static HTML version of that page.
The same applies to search engines (even though Google and others are starting to support JavaScript sites).
Here's a good approach for an SPA to still support scrapers:
use history.pushState in Angular to get clean virtual URLs when navigating through your app (i.e. URLs without a #)
server-side (Node.js or anything else), detect whether a request comes from a user or a bot (e.g. check the User-Agent using this lib: https://www.npmjs.com/package/is-bot )
if the request URL has a file extension, it's probably a static resource request (images, .css, .js); proxy it to get the static file
if the request URL is a page (i.e. not a static resource) and comes from a real user, always serve your index.html that loads your Angular app (pro tip: keep this file cached in memory)
if the request URL is a page and comes from a bot, serve a pre-rendered version of the requested URL (bots won't run JavaScript). This is the hard part (side note: ReactJS makes this problem much simpler). You can use a service like https://prerender.io/ ; they take care of loading your Angular app and saving each page as HTML (if you're curious, they use a headless/virtual browser in memory called PhantomJS to do that, simulating what a real user would do clicking "Save As..."). You can then request and proxy those pre-rendered pages to bot requests (like social network scrapers). If you want, it's also possible to run a prerender instance on your own servers.
All this server-side process I described is implemented in this express.js middleware by prerender:
https://github.com/prerender/prerender-node/blob/master/index.js
(even if you don't like prerender, you can use that code as an implementation guide)
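A stripped-down sketch of the same routing logic is below. It serves static files locally instead of proxying, uses a crude User-Agent regex instead of a bot-detection lib, assumes Node 18+ for the global fetch, and PRERENDER_URL is a placeholder for whatever prerender endpoint you run:

```js
const express = require("express");
const path = require("path");
const app = express();

// Crude bot check; a library like is-bot is more thorough.
const BOT_UA = /facebookexternalhit|twitterbot|googlebot|linkedinbot|slackbot/i;
// Placeholder for whatever prerender endpoint you run (hosted or self-hosted).
const PRERENDER_URL = "http://localhost:3001/render?url=";

// Requests with a file extension are static resources; serve them directly.
app.use(express.static(path.join(__dirname, "public")));

app.use(async (req, res) => {
  if (BOT_UA.test(req.headers["user-agent"] || "")) {
    // Bot: fetch the pre-rendered HTML for this URL and return it as-is.
    const page = await fetch(PRERENDER_URL + encodeURIComponent(req.originalUrl));
    return res.send(await page.text());
  }
  // Real user: always serve the SPA shell; Angular takes over routing client-side.
  res.sendFile(path.join(__dirname, "public", "index.html"));
});

app.listen(8080);
```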
Alternatively, here's an implementation example using only nginx:
https://gist.github.com/thoop/8165802

Evernote Resource URLs

I'm writing an application that takes a user's Evernote notes and displays them inline in a website. By its very nature, people accessing the resources attached to a note will not be logged in. I looked at the bottom of this page and saw how to pass authentication credentials via POST and get the resource. This is exactly what I need.
My question is: how does this work in the real world? If I pass authentication tokens to the JavaScript client (not secure in the first place), I can't get the resource because of Access-Control-Allow-Origin restrictions. The only other way I can think of doing this is saving all of the resources to my server and serving them from there, but that's not ideal (Google App Engine).
Ideas?
Yeah, Evernote does not support CORS yet. You can do it in a Chrome extension or fetch the resource on the server side.
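If you go the server-side route, a proxy could look roughly like this (Node/Express just for illustration, Node 18+ for the global fetch; the resource URL pattern and the auth form field follow the Evernote page mentioned in the question, and the session fields are placeholders):

```js
const express = require("express");
const app = express();

app.get("/notes/resource/:guid", async (req, res) => {
  // However you actually store the user's Evernote shard ID and auth token.
  const shardId = req.session?.evernoteShardId;
  const authToken = req.session?.evernoteToken;
  if (!shardId || !authToken) return res.sendStatus(401);

  // POST the auth token as form data to the resource URL; the resource
  // bytes come back in the response body.
  const upstream = await fetch(`https://www.evernote.com/shard/${shardId}/res/${req.params.guid}`, {
    method: "POST",
    body: new URLSearchParams({ auth: authToken }),
  });
  if (!upstream.ok) return res.sendStatus(upstream.status);

  res.type(upstream.headers.get("content-type") || "application/octet-stream");
  res.send(Buffer.from(await upstream.arrayBuffer()));
});

app.listen(3000);
```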

Creating and serving temporary HTML files in Azure

We have an application that we would like to migrate to Azure for scale. There is one area that concerns me before starting, however:
We have a web page that the user is directed to. The code behind on the page goes out to the database and generates an HTML report. The new HTML document is placed in a temporary file along with a bunch of charts and other images. The user is then redirected to this new page.
In Azure, we can never be sure that the user is going to be directed to the same machine for multiple reasons: the Azure load balancer may push the user out to a different machine based on capacity, or the machine may be deprovisioned because of a problem, or whatever.
Because these are only temporary files that get created and deleted very frequently, I would ideally like to just point my application's temp directory at some kind of shared drive that all the web roles have read/write access to, and then be able to map a URL to this shared drive. Is that possible, or is this going to be more complicated than I would like?
I can still have every instance write to its own local temp directory as well. It only takes a second or two to serve the files, so I'm OK with taking the risk that the instance goes down during that brief window. The question in this regard is whether the redirect to the temp HTML file will use HTTP 1.1 and maintain the connection to that specific instance.
thanks,
jasen
There are 2 things you might want to look at:
Use Windows Azure Web Sites, which supports a kind of distributed filesystem (based on blob storage). Files you store "locally" in your Windows Azure Web Site will be available from each server hosting that Web Site (if you use multiple instances).
Serve the files from Blob Storage. Instead of saving the HTML files locally on each instance (or trying to make users stick to a specific instance), simply store them in Blob Storage and redirect your user there.
Good stuff from @Sandrino. A few more ideas:
Store the resulting HTML in the in-role cache (which can be co-located in your web role instances) and serve the HTML from cache (shared across all instances).
Take advantage of the CDN. You can map a "CDN" folder to the actual edge cache. You generate the HTML in code once, and then it's cached until TTL expiry, when you must generate the content again.
I think Azure Blob Storage is the best place to store your HTML files so they can be accessed by multiple instances. You can redirect the user to the blob content, or you can write a custom page that renders the content from the blob.
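As a rough illustration of that blob approach (Node.js SDK here just for the sketch; the original app may well be .NET, and the connection string, container name, and blob naming are placeholders):

```js
const { BlobServiceClient } = require("@azure/storage-blob");

// Connection string and container name are placeholders.
const service = BlobServiceClient.fromConnectionString(process.env.AZURE_STORAGE_CONNECTION_STRING);
const container = service.getContainerClient("temp-reports");

async function publishReport(reportId, html) {
  const blob = container.getBlockBlobClient(`${reportId}.html`);
  await blob.upload(html, Buffer.byteLength(html), {
    blobHTTPHeaders: { blobContentType: "text/html" },
  });
  // Any instance can now redirect the user here, regardless of which one
  // generated the report. The URL is only directly readable if the container
  // allows public read access; otherwise append a SAS token.
  return blob.url;
}
```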
