I'm using AWS EC2 to run a database that supports search capabilities - similar to Elasticsearch. The database is only running in a single AWS region due to budgetary constraints.
The database also runs inside a private subnet in a VPC. Currently no inbound or outbound connections to or from it are allowed.
I need to allow access to the database so that only my serverless functions can connect to it via HTTP. Users should not be allowed to access it directly from the client-side. Using Lambda is possible but is far from ideal due to long cold start times. Users will expect search results to appear very quickly, especially when the page first loads. So something else is required.
The plan is to replace Lambda with Cloudflare Workers. With faster start times and closer distance to end users all over the world, connecting to the database this way would probably give me the speed I need while still offering all the benefits of a serverless approach. However, I'm not sure how I can configure my VPC security group to allow connections only from a specific worker.
I know that my workers all have unique domains such as https://unique-example.myworkerdomain.com and they remain the same over time. So is there a way I can securely allow inbound connections from this domain while blocking everything else? Can/should this be done by configuring security groups, an internet gateway, an IAM role, or something else entirely?
Thank you for any help and advice
There are a few options.
ECS
You can run an ECS cluster in the same VPC as your database and run Fargate tasks, which have sub-second start times (maybe 100 ms or less). You can also run ECS tasks on hot cluster instances, but then you pay for them all the time; a scale-to/from-zero approach with ECS could let you manage cost without compromising most user requests (the first request after a scale-to-zero event would see 100 ms+ of extra latency, but subsequent requests would not). Lambda actually does something similar under the hood, just with much more aggressive scale-down timelines. This doesn't restrict access to a specific domain, but it may solve your issue.
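As a rough illustration of the scale-to/from-zero idea (the cluster, service, and region names below are placeholders, not anything from your setup), the Application Auto Scaling API lets the Fargate service's desired count drop to zero while idle:

```python
import boto3

# Sketch only: all resource names are placeholders.
autoscaling = boto3.client("application-autoscaling", region_name="us-east-1")

# Allow the Fargate service fronting the database to run between 0 and 4 tasks.
autoscaling.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId="service/search-cluster/search-api",
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=0,   # zero running tasks (and zero task cost) while idle
    MaxCapacity=4,
)

# Something still has to bring the service back up: a scaling policy, a
# scheduled action, or an explicit call like this one. The first request after
# scale-up pays the task start latency.
ecs = boto3.client("ecs", region_name="us-east-1")
ecs.update_service(cluster="search-cluster", service="search-api", desiredCount=1)
```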
Self-Managed Proxy
Depending on how your database is accessed, you might be able to put a reverse proxy such as Nginx in a public subnet and have it validate requests to limit access to the database. It could control access based on any request header, but I'd recommend TLS client validation to ensure that only your functions can reach the database through the proxy. It might even be possible to validate the domain this way, either by limiting the trusted CA to an intermediate CA that only signs certificates for that domain, or (I think) by having Nginx allow the connection only when traits of the client certificate, such as the domain name, match a regex.
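A minimal sketch of such a proxy, assuming hypothetical hostnames, certificate paths, and a private CA that you issue the functions' client certificates from:

```nginx
server {
    listen 443 ssl;
    server_name db-proxy.example.com;                    # placeholder hostname

    ssl_certificate         /etc/nginx/certs/proxy.crt;
    ssl_certificate_key     /etc/nginx/certs/proxy.key;

    # Only clients presenting a certificate signed by this private CA get through.
    ssl_client_certificate  /etc/nginx/certs/functions-ca.pem;
    ssl_verify_client       on;

    location / {
        # Optionally narrow further by the certificate subject (e.g. a CN regex).
        if ($ssl_client_s_dn !~ "CN=unique-example\.myworkerdomain\.com") {
            return 403;
        }
        proxy_pass http://10.0.1.10:9200;                # database in the private subnet
    }
}
```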
Route Through Your Corporate Network
Using a VPN, you can originate the function's traffic from within your network (or filter the request there somehow); the database can then stay in a private subnet, with connectivity allowed from the corporate network over the VPN.
Use AWS WAF
You make a public ALB pointing at your database and set up AWS WAF to block all requests that don't contain a specific header (such as an API key). Note: you may also have to set up CloudFront; I forget off the top of my head whether you can apply WAF directly to an ELB or not.

Also note: I don't particularly advise this, because I don't think WAF was designed to hold sensitive strings in its rules, so you would have to think about who has describerule / describewebacl permissions on WAF, and the rules may end up in logs because AWS doesn't expect them to be sensitive. Still, it might be possible for WAF to filter on something you find viable. I'm pretty sure you can filter on HTTP headers, but unless those headers are secret, anyone can connect by submitting a request with those headers. I don't think WAF can do client domain validation.
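If you do go down this path, a sketch with the WAFv2 API might look like the following. The ACL name, header name, and secret value are placeholders, and note again that the secret is visible to anyone who can read the WebACL:

```python
import boto3

wafv2 = boto3.client("wafv2", region_name="us-east-1")

# Sketch only: block everything by default, allow requests carrying one header value.
wafv2.create_web_acl(
    Name="search-db-acl",
    Scope="REGIONAL",                                   # regional scope (ALB / API Gateway)
    DefaultAction={"Block": {}},
    Rules=[
        {
            "Name": "allow-with-shared-header",
            "Priority": 0,
            "Statement": {
                "ByteMatchStatement": {
                    "SearchString": b"my-shared-secret",             # placeholder secret
                    "FieldToMatch": {"SingleHeader": {"Name": "x-api-key"}},
                    "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
                    "PositionalConstraint": "EXACTLY",
                },
            },
            "Action": {"Allow": {}},
            "VisibilityConfig": {
                "SampledRequestsEnabled": False,
                "CloudWatchMetricsEnabled": False,
                "MetricName": "allow-with-shared-header",
            },
        },
    ],
    VisibilityConfig={
        "SampledRequestsEnabled": False,
        "CloudWatchMetricsEnabled": False,
        "MetricName": "search-db-acl",
    },
)
```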
I'm trying to follow this tutorial here, but I can't complete the verification step (#4). My domain provider doesn't allow me to add a DNS record of type AAAA. I tried contacting my domain provider, but they say it's not supported. Is there another workaround I could try? Should I try using another cloud hosting service like Azure?
You can use the features and capabilities that Cloud DNS offers. There's no need to switch cloud hosting services.
Cloud DNS is a high-performance, resilient, global Domain Name System (DNS) service that publishes your domain names to the global DNS in a cost-effective way.
Migrate your existing DNS domain from its current DNS provider to Cloud DNS.
Then, Managing Records makes it easy for you to add and remove records. This is done using a transaction that specifies the operations you want to perform; a transaction supports one or more record changes that are propagated together.
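For illustration, such a transaction can be driven from the google-cloud-dns Python client; the project, zone, record name, and address below are assumptions:

```python
from google.cloud import dns

# Sketch: add an AAAA record through a Cloud DNS "changes" transaction.
client = dns.Client(project="my-project")
zone = client.zone("my-zone", "example.com.")

changes = zone.changes()
changes.add_record_set(
    zone.resource_record_set(
        "www.example.com.",      # record name (note the trailing dot)
        "AAAA",                  # the record type the original provider rejected
        3600,                    # TTL in seconds
        ["2001:db8::1"],         # placeholder IPv6 address
    )
)
changes.create()                 # submits the transaction; changes propagate together
```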
Update
I would also check out Google Domains, a fairly new service (still in beta) that lets you register your domain name; it works like a charm.
I'm trying to get a proof of concept going for a multi-tenancy containerized ASP.NET MVC application in Service Fabric. The idea is that each customer would get 1+ instances of the application spread across the cluster. One thing I'm having trouble getting mapped out is routing.
Each app would be partitioned similar to this SO answer. The plan so far is to have an external load balancer route each request to the SF Reverse Proxy service.
So for instance:
tenant1.myapp.com would get routed to the reverse proxy at <SF cluster node>:19081/myapp/tenant1 (19081 is the default port for the SF Reverse Proxy), tenant2.myapp.com to <SF Cluster Node>:19081/myapp/tenant2, and so on; the proxy would then route each request to the correct node:port where an instance of the application is listening.
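In other words, the routing layer only has to turn the Host header into a reverse-proxy path. A tiny sketch of that mapping (the node address and app name are placeholders):

```python
# Hypothetical sketch: map a tenant subdomain to the SF Reverse Proxy URL.
def reverse_proxy_url(host_header: str, node: str = "10.0.0.4", app: str = "myapp") -> str:
    tenant = host_header.split(".")[0]            # "tenant1.myapp.com" -> "tenant1"
    return f"http://{node}:19081/{app}/{tenant}"  # 19081 = default SF Reverse Proxy port

assert reverse_proxy_url("tenant1.myapp.com") == "http://10.0.0.4:19081/myapp/tenant1"
```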
Since each application has to be mapped to a different port, the plan is for SF to dynamically assign a port on creation of each app. This doesn't seem entirely scalable, since we could theoretically hit the port limit (~65k).
My questions then are, is this a valid/suggested approach? Are there better approaches? Are there things I'm missing/overlooking? I'm new to SF so any help/insight would be appreciated!
I don't think the ephemeral port limit will be an issue for you; it is likely that you will exhaust server resources (CPU + memory) before you consume even half of those ports.
What you need is possible, but it will require you to create a script or an application responsible for creating and managing the configuration of the deployed service instances.
I would not use the built-in reverse proxy; it is very limited, and for what you want it would just add extra configuration with no benefit.
At the moment I see Traefik as the most suitable solution. Traefik lets you route specific domains to specific services, which is exactly what you want.
Because you will use multiple domains, you will need dynamic configuration that is not provided out of the box; this is why I suggested creating a separate application to deploy these instances. At a very high level, the steps would be:
You define your service with the traefik default rules as shown here
From your application manager, you deploy a new named instance of this service for the new tenant
After the instance is deployed, you configure it to listen on a specific domain by setting the rule traefik.frontend.rule=Host:tenant1.myapp.com to the correct tenant name (a rough sketch of this follows below)
You might have to add some extra configuration, but this should put you on the right path.
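As an illustration only, with the Traefik Service Fabric provider the rule usually ends up as a label in the service manifest. The exact label keys and namespace depend on the provider version (check the Traefik docs linked below), and the service type name and tenant host here are placeholders:

```xml
<StatelessServiceType ServiceTypeName="MyAppType">
  <Extensions>
    <Extension Name="Traefik">
      <Labels xmlns="http://schemas.microsoft.com/2015/03/fabact-no-schema">
        <Label Key="traefik.enable">true</Label>
        <Label Key="traefik.frontend.rule.default">Host:tenant1.myapp.com</Label>
      </Labels>
    </Extension>
  </Extensions>
</StatelessServiceType>
```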
Regarding the cluster architecture, you could do it in many ways. To start, I would recommend keeping it simple: one FrontEnd node type containing the Traefik services and another BackEnd node type for your services. From there you can decide how to plan the cluster properly; there are already many SO answers on how to define the cluster.
Please see more info on the following links:
https://blog.techfabric.io/using-traefik-reverse-proxy-for-securing-microservices-on-azure-service-fabric/
https://docs.traefik.io/configuration/backends/servicefabric/
Assuming you don't need an instance on every node, you can have up to (nodecount * 65K) services, which would make it scalable again.
Have a look at Azure API Management and Traefik, which have some SF integration options. These work a lot nicer than the limited built-in reverse proxy; for example, they offer routing rules.
Is there an elegant way to prevent requests from referrers outside the application? Looking at the app.yaml documentation, it doesn't seem like this is built-in functionality, but it seems like it'd be so preferred/common that it has to be hidden somewhere, rather than necessarily having to be reimplemented manually for every application.
It is built in.
In app.yaml you can specify login: admin for a handler, and it will accept requests only from an admin or from App Engine itself (e.g. cron, task queues, and probably urlfetch from the app itself; I'm not 100% sure about the last one).
See docs: https://cloud.google.com/appengine/docs/python/config/appref#handlers_element
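For example, a handler entry along these lines (the runtime, URL pattern, and script are only an illustration) keeps everything except admins and App Engine's own requests out:

```yaml
runtime: python27
api_version: 1
threadsafe: true

handlers:
- url: /api/.*
  script: main.app      # hypothetical WSGI app
  login: admin          # signed-in admins and App Engine (cron, task queue) only
  secure: always
```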
You can also check HTTP headers like the referer, the client IP, and the user agent.
You can also issue a token and pass/check it with every request.
There are probably a few issues being conflated here.
CORS - a browser-enforced security measure to prevent pages maliciously or otherwise sending data to non-origin servers. Servers cannot enforce this, only permit it. Permitting it is an application-level concern (i.e. not built into GAE)
XSRF - a server-enforced security measure to stop authenticated users having their accounts abused by malicious client code. This is an application-level concern (i.e. not built into GAE)
Authentication - identifying a user or client by some set of credentials. There is some support for this in GAE (Cloud Endpoints, the provided identity service, requiring admin login), but it would typically also be an application-level concern. For authorizing a client (as opposed to a user), there is no built-in support.
Authorization - different sets of access based on roles/permissions. Not built in.
Other solutions:
Using the Host or Origin header - trivially outfoxed by someone crafting a curl request, or by any more sophisticated application
API token - fine for server-to-server over HTTPS, but trivially compromised when used in a published client (like a web page)
Your best bet is to leverage your framework and have user accounts.
If you don't want to do this, something like XSRF (token in a header and cookie) would be enough in the general case to ensure that a web client was responding to your 'app'. This only works if the client is a web browser though, same as origin/host.
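A minimal, framework-agnostic sketch of that header-plus-cookie check (the function names are made up):

```python
import hmac
import secrets

# Double-submit sketch: the server sets a random token in a cookie, the web
# client echoes it back in a header, and the server only accepts requests
# where the two values match.

def issue_token() -> str:
    return secrets.token_urlsafe(32)          # set this as the XSRF cookie

def request_is_trusted(cookie_token: str, header_token: str) -> bool:
    if not cookie_token or not header_token:
        return False
    return hmac.compare_digest(cookie_token, header_token)
```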
No. There is no built-in logic in GAE for that. Any support would have to exist at the level of your application-specific request routing.
I need to create licensing server for my application.
The application should ping the licensing server and stop working if the license has expired.
How should it be done securely? I haven't found any articles about this.
More exactly, what confuses me is how to prevent an attacker from doing the following:
Look at where I make requests (e.g. using Fiddler)
Create his own server
Point his PC to that server using the /etc/hosts file
Are there any best practices for this?
You can do this by enabling HTTPS on your server. Your application will need to verify the HTTPS certificate to ensure the remote host is not a fake licensing server.
This article describes the attack you mention, and how it is possible to avoid it using HTTPS.
Here's a useful sample:
Defeating Active Attackers
Verifying the server’s authenticity is key to defeating active attackers. Fortunately, TLS has this covered as well. As you recall, HTTPS is really just HTTP running over TLS. When HTTPS is implemented correctly, here is what happens to active attackers.
Because the legitimate server's Certificate Authority (CA) verifies ownership of the domain (yourwebsite.com), an active attacker cannot fake the certificate. Encryption prevents the attacker from reading or modifying any intercepted data. In short, the entire CIA triad is satisfied and both passive and active attackers are defeated.
In your case, the roles are slightly different: the user is your application, while the potential attacker is the application user who doesn't want to pay for a license. ;)
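To make the licensing check concrete, here is a rough sketch in Python. The endpoint URL and response format are assumptions; the key point is the pinned CA bundle shipped with the application, which makes the /etc/hosts trick fail TLS verification:

```python
import requests

# Ship this CA bundle with the application; the client then trusts only the CA
# that signed the licensing server's certificate, so a fake server pointed at
# via /etc/hosts cannot present a valid certificate.
PINNED_CA_BUNDLE = "license_server_ca.pem"

def license_is_valid(license_key: str) -> bool:
    resp = requests.get(
        "https://license.example.com/check",   # hypothetical endpoint
        params={"key": license_key},
        verify=PINNED_CA_BUNDLE,               # reject certs not signed by our CA
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("valid", False)     # assumed response shape
```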
I was thinking of making use of Elasticsearch and want to know all the possible security loopholes in Elasticsearch and how to take care of them. Also, what effect will this have on the performance of Elasticsearch?
Elasticsearch is not secure by default, which means anybody who knows your IP can access it. But there are lots of ways to secure it.
In the configuration you can set network.bind_host to localhost or your intranet IP so that it is accessible only from there. For more details check out the docs.
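For example, a couple of lines in elasticsearch.yml (the loopback address could equally be an intranet address) keep the node off the public internet:

```yaml
# Bind only to loopback (or an intranet address such as 10.0.0.5).
network.bind_host: 127.0.0.1
http.port: 9200
```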
You can simply restrict access to the port (the default is 9200) using iptables.
You can use nginx as a proxy, which gives you all the goodness and configurability of nginx. Read about it in "playing HTTP tricks with nginx".
Elastic also has a commercial security product called Shield.
There are a few other security plugins available on the net as well. Though Elasticsearch is not secured by default, it is easy to set up security around it.
Of all of these, I personally prefer the nginx proxy, as it is very easy to set up and gives me the added advantage of logging every request to Elasticsearch via the nginx access logs.
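A bare-bones sketch of such a proxy, with hypothetical hostnames, certificate paths, and htpasswd file:

```nginx
server {
    listen 443 ssl;
    server_name search.example.com;                      # placeholder hostname

    ssl_certificate     /etc/nginx/certs/search.crt;
    ssl_certificate_key /etc/nginx/certs/search.key;

    # Every request to Elasticsearch shows up in this access log.
    access_log /var/log/nginx/es_access.log;

    location / {
        auth_basic           "Elasticsearch";
        auth_basic_user_file /etc/nginx/es.htpasswd;
        proxy_pass           http://127.0.0.1:9200;      # ES bound to localhost only
    }
}
```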
Lastly, the security additions will have no/negligible performance impact.
Elasticsearch is insecure by default; however, I'd really hesitate to say that's any different from any other service. You shouldn't have your database connection publicly facing, right? You should really consider treating it like any other service that you wouldn't want publicly accessible. Elasticsearch does provide HTTPS and basic auth, so it has the capability to be secure as long as you make it so, but the same can be said about many services you deploy.