Cloud Storage Signed URLs By Proxy - google-app-engine

I'm trying to support large file uploads for a Cloud Run (and App Engine) project. There are some constraints that prevent the usual workarounds from working:
The clients are .NET 4.0 applications, which means HTTP/2 is not available (HTTP/2 would at least get around Cloud Run's 32 MB request size limit)
Legacy clients cannot be upgraded, so chunked uploads are not available for them, and backwards compatibility is a requirement
Signed URLs to Cloud Storage are the current solution and work well; however, some percentage of clients do not work at all because the customer's IT has blocked googleapis (but not our company domain)
Asking the customer's IT to unblock googleapis is difficult or a non-starter
This leads me to the conclusion that I should set up a forward proxy that allows signed URLs to get around IT restrictions through our GCP project/company domain. I would accomplish this in Compute Engine with an instance running nginx, squid, or something similar, and then have a load balancer direct URLs of a certain pattern to the forward proxy, which would rewrite the URL to the correct Cloud Storage signed URL and forward the request.
However, this seems like a bit of a clunky solution. Is there something simpler, native to GCP, that accomplishes what I'm trying to do?

I was able to proxy cloud storage signed URLs using nginx:
events {
    worker_connections 1024;
}
http {
    client_max_body_size 500M;
    server {
        listen 80;
        listen [::]:80;
        server_name mydomain;
        location /storagepxy/ {
            proxy_pass https://storage.googleapis.com/;
        }
    }
}
I then set up a GCP load balancer to direct any requests starting with /storagepxy/* to a Compute Engine instance group running nginx with the above config.
Thus, I could read/write to cloud storage using requests of the form:
GET mydomain/storagepxy/[cloud storage signeduri]
PUT mydomain/storagepxy/[cloud storage signeduri]
So if you had a signed URL like:
https://storage.googleapis.com/example-bucket/cat.jpeg?X-Goog-Algorithm=
GOOG4-RSA-SHA256&X-Goog-Credential=example%40example-project.iam.gserviceaccount
.com%2F20181026%2Fus-central1%2Fstorage%2Fgoog4_request&X-Goog-Date=20181026T18
1309Z&X-Goog-Expires=900&X-Goog-SignedHeaders=host&X-Goog-Signature=247a2aa45f16
9edf4d187d54e7cc46e4731b1e6273242c4f4c39a1d2507a0e58706e25e3a85a7dbb891d62afa849
6def8e260c1db863d9ace85ff0a184b894b117fe46d1225c82f2aa19efd52cf21d3e2022b3b868dc
c1aca2741951ed5bf3bb25a34f5e9316a2841e8ff4c530b22ceaa1c5ce09c7cbb5732631510c2058
0e61723f5594de3aea497f195456a2ff2bdd0d13bad47289d8611b6f9cfeef0c46c91a455b94e90a
66924f722292d21e24d31dcfb38ce0c0f353ffa5a9756fc2a9f2b40bc2113206a81e324fc4fd6823
a29163fa845c8ae7eca1fcf6e5bb48b3200983c56c5ca81fffb151cca7402beddfc4a76b13344703
2ea7abedc098d2eb14a7
You could proxy it via:
https://mydomain/storagepxy/example-bucket/cat.jpeg?X-Goog-Algorithm=
GOOG4-RSA-SHA256&X-Goog-Credential=example%40example-project.iam.gserviceaccount
.com%2F20181026%2Fus-central1%2Fstorage%2Fgoog4_request&X-Goog-Date=20181026T18
1309Z&X-Goog-Expires=900&X-Goog-SignedHeaders=host&X-Goog-Signature=247a2aa45f16
9edf4d187d54e7cc46e4731b1e6273242c4f4c39a1d2507a0e58706e25e3a85a7dbb891d62afa849
6def8e260c1db863d9ace85ff0a184b894b117fe46d1225c82f2aa19efd52cf21d3e2022b3b868dc
c1aca2741951ed5bf3bb25a34f5e9316a2841e8ff4c530b22ceaa1c5ce09c7cbb5732631510c2058
0e61723f5594de3aea497f195456a2ff2bdd0d13bad47289d8611b6f9cfeef0c46c91a455b94e90a
66924f722292d21e24d31dcfb38ce0c0f353ffa5a9756fc2a9f2b40bc2113206a81e324fc4fd6823
a29163fa845c8ae7eca1fcf6e5bb48b3200983c56c5ca81fffb151cca7402beddfc4a76b13344703
2ea7abedc098d2eb14a7
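If you generate the signed URL server-side, the rewrite to the proxied form can be done mechanically before handing the URL to the client. Here is a minimal Python sketch of that idea; the function name, the "mydomain" host and the "/storagepxy" prefix are just the placeholders from this answer, not anything provided by GCP. It only swaps the scheme/host and prepends the prefix, leaving the path and query string byte-for-byte untouched so the signature parameters survive.

```python
from urllib.parse import urlsplit, urlunsplit

GCS_HOST = "storage.googleapis.com"


def to_proxy_url(signed_url: str, proxy_host: str = "mydomain",
                 prefix: str = "/storagepxy") -> str:
    """Rewrite a Cloud Storage signed URL so it goes through the proxy.

    proxy_host and prefix are placeholders matching the nginx config above.
    The path and query are copied as-is (no decoding or re-encoding).
    """
    parts = urlsplit(signed_url)
    if parts.netloc != GCS_HOST:
        raise ValueError("not a storage.googleapis.com URL: " + signed_url)
    return urlunsplit(("https", proxy_host, prefix + parts.path, parts.query, ""))
```

For example, to_proxy_url("https://storage.googleapis.com/example-bucket/cat.jpeg?X-Goog-Expires=900") gives "https://mydomain/storagepxy/example-bucket/cat.jpeg?X-Goog-Expires=900".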
Note: If your bucket path contains URL-encoded characters such as colons, you'll need a slightly more complicated nginx config:
# This is a simple nginx configuration file that will proxy URLs of the form:
# https://mydomain/storagepxy/[signed uri]
# to
# https://storage.googleapis.com/[signed uri]
#
# For use in GCP, you'll likely need to create an instance group in compute engine running nginx with this config
# and then hook up a load balancer to forward requests starting with /storagepxy to it
worker_processes auto; # Auto should spawn 1 worker per core
events {}
http {
    client_max_body_size 500M;
    server {
        listen 80; # IPv4
        listen [::]:80; # IPv6
        server_name mydomain;
        location /storagepxy/ {
            # To resolve storage.googleapis.com
            resolver 8.8.8.8;
            # We have to do it this way in case filenames have URL-encoded characters in them
            # See: https://stackoverflow.com/a/37584637
            # Also note, if the URL does not match the rewrite rules then return 400
            rewrite ^ $request_uri;
            rewrite ^/storagepxy/(.*) $1 break;
            return 400;
            proxy_pass https://storage.googleapis.com/$uri;
        }
    }
}
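To illustrate why the encoded characters matter, here is a small Python sketch. It assumes the signature check depends on the exact encoded path, and it uses a toy hash as a stand-in for the signature (this is not Google's V4 signing algorithm). The object name is made up. The point is only that a proxy which decodes %3A to a colon before forwarding ends up sending a different path than the one that was signed.

```python
import hashlib
from urllib.parse import unquote


def toy_signature(path: str) -> str:
    """Toy stand-in for a signature over the request path (NOT the real V4 algorithm)."""
    return hashlib.sha256(path.encode("utf-8")).hexdigest()


# Hypothetical object name containing URL-encoded colons.
signed_path = "/example-bucket/2018-10-26T18%3A13%3A09.jpeg"

# Roughly what a proxy forwards if it decodes the URI before passing it on.
decoded_path = unquote(signed_path)

# What the config above aims for: the raw request URI with only the prefix removed.
raw_request_uri = "/storagepxy" + signed_path
forwarded_path = raw_request_uri[len("/storagepxy"):]

print(toy_signature(decoded_path) == toy_signature(signed_path))    # mismatch
print(toy_signature(forwarded_path) == toy_signature(signed_path))  # match
```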

Related

Nginx is redirecting subdomain to main domain

I have two domains,
zerp.io (ssl installed)
app.zerp.io (only http)
zerp.io (the main domain) hosts a WordPress website that is working fine. I am trying to deploy a React app on app.zerp.io using nginx. I deleted the default file and created a new file app.zerp.io at /etc/nginx/sites-available/. I also created the same file at /etc/nginx/sites-enabled/ and created a symlink between them. I checked the DNS entries: app.zerp.io and www.app.zerp.io are both pointing to the public IP of the correct server, where the React app resides.
Here's my /etc/nginx/sites-available/app.zerp.io file
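server {
    listen 80;
    index index.html index.htm index.nginx-debian.html;
    server_name www.app.zerp.io app.zerp.io;
    location / {
        # proxy_pass needs a scheme (http://); a bare host:port is rejected by nginx
        proxy_pass http://localhost:3000;
        # Forward the original Host header (the directive is proxy_set_header)
        proxy_set_header Host $host;
    }
}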
The problem is that whenever I try to reach http://app.zerp.io through a web browser, it redirects me to https://zerp.io. Here's what I have done so far:
I checked DNS using an online tool; it's correctly pointing to the server
I did not use any 301 redirects in the configuration file, as you can see above
When I try curl app.zerp.io from the production server (in Germany), sometimes it gives 200 with the correct response and sometimes it gives 301 (Moved Permanently). Crazy, isn't it?
When I try curl app.zerp.io from my local computer, it always gives me a 301, although I do not have any 301 in my nginx config file
I thought maybe it was a cache issue in Chrome. To my surprise, it wasn't: I cleared the cache and did a hard reload, and I even tried incognito mode with no success. It always redirects me to https://zerp.io
When I try curl app.zerp.io from my local computer using a VPS, it correctly opens the website app.zerp.io.
I do not have any SSL certificate for it, so there are no redirects from http to https on http://app.zerp.io
It's been two days and it's making me crazy. I am assuming it has something to do with DNS resolution. Can someone please help me out?

Deploying Client and Server to the same VM

I have an application that has a React frontend and a Python Flask backend. The frontend communicates with the server to perform specific operations, and the server API should only be used by the client.
I have deployed the whole application (client and server) to an Ubuntu virtual machine. The machine only has specific ports open to the public (5000, 443, 22). I have set up the Nginx configuration, and the frontend can be accessed from my browser via http://<ip:address>:5000. The server is running locally on a different port, 4000, which is not accessible to the public, as designed.
The problem is that when I access the client app and navigate to the pages that communicate with the server via http://127.0.0.1:4000 from the React app, I get an error saying the connection was refused:
GET http://127.0.0.1:4000/ net::ERR_CONNECTION_REFUSED on my browser.
When I ssh into the VM and run the same request through curl (curl http://127.0.0.1:4000/), I get a response and everything works fine.
Is there a way I can deploy the server in the same VM such that when I access the client React app from my browser, the React app can access the server without problems?
So after tinkering with this, I found a solution using Nginx. The request fails because the React code runs in the visitor's browser, where 127.0.0.1 refers to the visitor's own machine rather than the VM. In summary, you run the server locally on a different port, say 4000 (not exposed to the public), and expose your React app on the public port, in this case 5000.
Then use a proxy in your Nginx config that forwards any request whose path starts with /api to the server running on localhost. See the config below:
server {
    # Exposed port and server name / IP address
    # (the "ssl" flag on listen replaces the deprecated "ssl on;" directive)
    listen 5000 ssl;
    server_name 1.2.3.4 mydomain.com;
    # SSL
    ssl_certificate /etc/nginx/ssl/nginx.crt;
    ssl_certificate_key /etc/nginx/ssl/nginx.key;
    ssl_protocols TLSv1.2;
    # Link to the React build
    root /var/www/html/build;
    index index.html index.htm;
    # Error and access logs for debugging
    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log;
    # Serve static files, falling back to index.html for client-side routes
    location / {
        try_files $uri /index.html =404;
    }
    # Forward any traffic beginning with /api to the Flask server
    location /api {
        include proxy_params;
        proxy_pass http://localhost:4000;
    }
}
Now this means all your server endpoints need to begin with /api/..., and the user can also access an endpoint directly from the browser via http://<ip:address>:5000/api/endpoint
You can mitigate this by having your client send a token to the server, so that the server will not run any commands without that token/authorization.
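As a rough illustration of the token check, here is a framework-independent Python sketch. The header format ("Authorization: Bearer ..."), the function name and the hard-coded token are all assumptions made for the example; in a real Flask app you would call something like this from a request hook and load the token from configuration. Keep in mind that any token shipped inside a browser app can be read by the user, so this limits casual access rather than providing strong protection.

```python
import hmac

# Hypothetical shared secret; in practice load it from configuration, not source code.
EXPECTED_TOKEN = "change-me"


def is_authorized(headers: dict) -> bool:
    """Return True only if the request carries the expected bearer token."""
    auth = headers.get("Authorization", "")
    scheme, _, token = auth.partition(" ")
    if scheme != "Bearer" or not token:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.encode("utf-8"), EXPECTED_TOKEN.encode("utf-8"))
```

The server would reject any /api request for which is_authorized(...) returns False, for example with a 401 response.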
I found the solution in part two of a three-part series and modified it to fit my specific need. Parts one and three belong to the same series.

Can a ReactJS app with a router be hosted on S3 and fronted by an nginx proxy?

I may be twisting things about horribly, but... I was given a ReactJS application that has to be served out to multiple sub-domains, so
a.foo.bar
b.foo.bar
c.foo.bar
...
Each of these should point to a different instance of the application, but I don't want to run npm start for each one - that would be a crazy amount of server resources.
So I went to host these on S3. I have a bucket foo.bar and then directories under that for a b c... and set that bucket up to serve static web sites. So far so good - if I go to https://s3.amazonaws.com/foo.bar/a/ I will get the index page. However most things tend to break from there as there are non-relative links to things like /css/ or /somepath - those break because they aren't smart enough to realize they're being served from /foo.bar/a/. Plus we want a domain slapped on this anyway.
So now I need to map a.foo.bar -> https://s3.amazonaws.com/foo.bar/a/. We aren't hosting our domain with AWS, so I'm not sure if it's possible to front this with CloudFront or similar. Open to a solution along those lines, but I couldn't find it.
Instead, I stood up a simple nginx proxy. I also added in forcing to https and some other things while I had the proxy, something of the form:
server {
    # The "ssl" flag on listen replaces the deprecated "ssl on;" directive
    listen 443 ssl;
    server_name foo.bar;
    ssl_certificate /etc/pki/tls/certs/server.crt;
    ssl_certificate_key /etc/pki/tls/certs/server.key;
    ssl_session_timeout 5m;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_prefer_server_ciphers on;
    # Redirect (*).foo.bar to (s3bucket)/(*)
    location / {
        index index.html index.htm;
        set $legit "0";
        set $index "";
        # First off, we lose the index document functionality of S3 when we
        # proxy requests. So we need to add that back on to our rewrites if
        # needed. This is a little dangerous, probably should find a better
        # way if one exists.
        if ($uri ~* "\.foo\.bar$") {
            set $index "/index.html";
        }
        if ($uri ~* "\/$") {
            set $index "index.html";
        }
        # If we're making a request to foo.bar (not a sub-host),
        # make the request directly to "production"
        if ($host ~* "^foo\.bar") {
            set $legit "1";
            rewrite /(.*) /foo.bar/production/$1$index break;
        }
        # Otherwise, take the sub-host from the request and use that for the
        # redirect path
        if ($host ~* "^(.*?)\.foo\.bar") {
            set $legit "1";
            set $subhost $1;
            rewrite /(.*) /foo.bar/$subhost/$1$index break;
        }
        # Anything else, give them foo.bar
        if ($legit = "0") {
            return 302 https://foo.bar;
        }
        # Perform the actual proxy forward
        proxy_pass https://s3.amazonaws.com/;
        proxy_set_header Host s3.amazonaws.com;
        proxy_set_header Referer https://s3.amazonaws.com;
        proxy_set_header User-Agent $http_user_agent;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Accept-Encoding "";
        proxy_set_header Accept-Language $http_accept_language;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        sub_filter google.com example.com;
        sub_filter_once off;
    }
}
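For readers who find the chain of if/rewrite blocks hard to follow, the host-to-path mapping can be modelled in a few lines of Python. This is only a sketch of the intent under simplifying assumptions: it models the trailing-slash index handling and the two host checks, ignores the first `$uri` check for `.foo.bar`, and returns None where the config would issue the 302 to https://foo.bar. The function name is made up.

```python
import re
from typing import Optional


def s3_path(host: str, uri: str) -> Optional[str]:
    """Model of the rewrite rules: map (host, request URI) to a path under the bucket."""
    # Add the index document back for directory-style requests.
    index = "index.html" if uri.endswith("/") else ""
    # Equivalent of $1 in "rewrite /(.*)": everything after the leading slash.
    rest = uri[1:] if uri.startswith("/") else uri

    # Bare domain: serve from the "production" directory.
    if re.match(r"^foo\.bar", host, re.IGNORECASE):
        return "/foo.bar/production/" + rest + index

    # Sub-host: use it as the directory name inside the bucket.
    match = re.match(r"^(.*?)\.foo\.bar", host, re.IGNORECASE)
    if match:
        return "/foo.bar/" + match.group(1) + "/" + rest + index

    # Anything else: the config redirects to https://foo.bar instead.
    return None
```

With this model, s3_path("a.foo.bar", "/") is "/foo.bar/a/index.html" and s3_path("a.foo.bar", "/css/app.css") is "/foo.bar/a/css/app.css".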
This works: I go to a.foo.bar, I get the index page I expect, and clicking around works. However, part of the application also does an OAuth-style login and expects the browser to be redirected back to the page at /reentry?token=foo... The problem is that this path only exists as a route in the React app, and a static web server like S3 doesn't load that app, so you just get a 404 (or a 403, because I don't have an error page defined or forwarded yet).
So.... All that for the question...
Can I serve a ReactJS application from a dumb/static server like S3, and have it understand callbacks to its routes? Keep in mind that the index/error directives in S3 seem to be discarded when fronted with a proxy the way I have above.
OK, there was a lot in my original question, but the core of it really came down to: as a non-UI person, how do I make an OAuth workflow work with a React app? The callback URL in this case is a route, which doesn't exist unless the index.html page has been loaded. If you're going directly against S3, this is solved by directing all errors to index.html, which reloads the routes, and the callback works.
When fronted by nginx, however, we lose this error->index.html routing. Fortunately, it's a pretty simple thing to add back:
location / {
    proxy_intercept_errors on;
    error_page 400 403 404 500 =200 /index.html;
    # ... rest of the location block (rewrites, proxy_pass) unchanged
}
You probably don't need all of those status codes; for S3, the big one is the 403. When you request a page that doesn't exist, S3 treats it as though you're trying to browse the bucket and gives you back a 403 Forbidden rather than a 404 Not Found. So in this case a response from S3 that results in a 403 gets redirected to /index.html, which reloads the routes defined there, and the callback to /reentry?token=... works.
You can use Route 53 to buy domain names and then point them at your S3 bucket, and you can do this with as many domains as you like.
Strictly speaking, you don't need to touch CloudFront, but it's recommended, as it is a CDN and so gives a better user experience.
When deploying applications to S3, all you need to keep in mind is that the code you deploy to it is going to run 100% on your user's browser. So no server stuff.

How To Avoid Mixed Content with Docker Apps

I am running a Django based web application inside a set of Docker containers and I'm trying to include both a REST API (using django-REST-framework) as well as the ReactJS app that consumes it. All my other apps are served over HTTPS but I am running into Mixed Active Content when it comes to the React app hitting the REST API inside the Docker network. The React App is being hosted within my NGINX container and served up as a static site.
Here's the relevant config for my Nginx container:
# upstream must be declared in the http context, outside the server block
upstream django {
    server web:9000;
}
# SSL Website
server {
    listen 443 http2 ssl;
    listen [::]:443 http2 ssl;
    server_name *.domain.com;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers EECDH+CHACHA20:EECDH+AES128:RSA+AES128:EECDH+AES256:RSA+AES256:EECDH+3DES:RSA+3DES:!MD5;
    ssl_prefer_server_ciphers on;
    ssl_certificate /etc/nginx/ssl/my_cert.crt;
    ssl_certificate_key /etc/nginx/ssl/my_key.key;
    ssl_stapling on;
    ssl_stapling_verify on;
    # Log file names were swapped in the original; each log now goes to its own file
    access_log /home/logs/access.log;
    error_log /home/logs/error.log;
    location / {
        include uwsgi_params;
        # Proxy settings
        proxy_pass http://django;
        proxy_http_version 1.1;
        proxy_buffering off;
        proxy_set_header Host $http_host;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
    # REACT APPLICATION
    location /faqs {
        autoindex on;
        sendfile on;
        alias /usr/share/nginx/html/faqs;
    }
}
During development, the React app was hitting my REST API from outside the network, so resource calls used https like so:
axios.get('https://myapp.domain.com/api/')
and everything went relatively smoothly, barring the occasional CORS error.
However, now that both the React app and the API are running inside the Docker network, NGINX is not involved in the communication between containers and the routes look like this:
axios.get('http://web:9000/api')
This gives me the aggravating Mixed Active Content error.
I've seen multiple questions similar to this but most are either not using Docker containers or use some NGINX directives I've already got in my config file. Given the popularity of Docker for these kind of loosely coupled applications I would imagine solutions abound for this kind of problem. Sadly I have not managed to come across any and as such, any suggestions would be greatly appreciated.
Since your application serves both an API and a web client from the same endpoint, you have a "gateway" in nginx that routes all requests to one or the other. So far, this is common practice (although you are missing a load balancer, but that's a different discussion).
All requests to your API should go over https. You should also be serving your static site over https with the same certificate from the same domain. If this isn't the case, there is your problem.
Furthermore, all routes and URLs inside your React application should be relative. That means the React app doesn't need to know what your domain is. Ideally, neither should your API, although that is sometimes harder to do.
Given that the React app is served from the same domain over https, your axios call should be
axios.get('/api')
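To make the reasoning concrete: a relative URL is resolved against the URL of the page that issued the request, so it inherits that page's scheme and host. The Python sketch below uses the standard library's URL resolution as a stand-in for what the browser does; the page URL and the helper name are assumptions for illustration, and the mixed-content check is a simplification of the browser's actual rules.

```python
from urllib.parse import urljoin, urlsplit


def is_mixed_content(page_url: str, request_url: str) -> bool:
    """Simplified check: an https page requesting a resource over plain http."""
    target = urljoin(page_url, request_url)  # resolve relative URLs against the page
    return urlsplit(page_url).scheme == "https" and urlsplit(target).scheme == "http"


page = "https://myapp.domain.com/faqs/"  # hypothetical URL the React app is served from

print(urljoin(page, "/api"))                          # https://myapp.domain.com/api
print(is_mixed_content(page, "/api"))                 # False
print(is_mixed_content(page, "http://web:9000/api"))  # True
```

The relative call stays on https and goes back through nginx, which then proxies it to the Django container inside the Docker network.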

Let's Encrypt / Certbot in Google App Engine -> can't check challenge -> Forbidden 403

Using Google App Engine and Let's Encrypt (Certbot), I'm trying to issue a certificate for my web app, but when the challenge is tested, the file hosted in /.well-known/acme-challenge/ can't be accessed, apparently because the nginx configuration prohibits access to dot paths. In other words, the request gets a 403 Forbidden page instead of the key.
I've already tried to change nginx.conf with this:
location ^~ /.well-known/ {
    allow all;
}
I restarted the nginx service, but I still can't get it to work.
Did you try using an alias?
location ^~ /.well-known {
    allow all;
    auth_basic off;
    alias /path/to/.well-known/;
}
