What would be the least expensive and most PaaS-agnostic way to protect and separate sensitive data in a multi-tenant application that uses a shared database?
Some background info and more specific questions:
We are a small startup company. We have successfully launched an intranet web application project for a customer, and now we feel ready to offer a cloud solution for similar kinds of customers.
As we are on the Microsoft BizSpark program, we are investigating Azure App Services. We might migrate later to Cloud Services, but for now it seems that App Services will be enough. Still, we don't want to tie ourselves to Azure too much in case we want to move to another provider later.
Our application will store some sensitive information which should be protected. A separate encrypted database for each tenant (Azure provides transparent encryption) would offer the most security, but we have no budget for such a solution and it would be hard to manage automatically.
Currently our plan is to let each customer register a subdomain under our wildcard domain; internally we map the subdomain to a tenant ID which is used in every table.
This seems to be the most cost-effective solution for a startup because there is no additional management overhead per tenant, and registration can be fully automated. I understand that we'll have to be super careful to enforce the use of the tenant ID in every SQL request (SQL views and stored procedures with the tenant ID built into the query will help), but that obviously is not enough. We need some mechanism to protect each tenant's sensitive data with an encryption key.
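A minimal sketch of the "tenant ID in every request" idea, using Python and sqlite3 purely for illustration (the class and table names are hypothetical; in SQL Server the same effect comes from views and stored procedures with the tenant ID built in):

```python
import sqlite3

class TenantScopedDb:
    """Wraps a connection so every query is filtered by tenant_id.

    Application code never writes the tenant filter by hand, which
    removes one class of 'forgot the WHERE clause' bugs.
    """

    def __init__(self, conn, tenant_id):
        self.conn = conn
        self.tenant_id = tenant_id

    def select(self, table, columns="*"):
        # Table/column names come from trusted code, never user input.
        # The tenant filter is appended unconditionally.
        sql = f"SELECT {columns} FROM {table} WHERE tenant_id = ?"
        return self.conn.execute(sql, (self.tenant_id,)).fetchall()

    def insert(self, table, **values):
        values["tenant_id"] = self.tenant_id  # stamped automatically
        cols = ", ".join(values)
        marks = ", ".join("?" * len(values))
        self.conn.execute(
            f"INSERT INTO {table} ({cols}) VALUES ({marks})",
            tuple(values.values()),
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (tenant_id TEXT, title TEXT)")

db_a = TenantScopedDb(conn, "tenant-a")
db_b = TenantScopedDb(conn, "tenant-b")
db_a.insert("documents", title="A's secret plan")
db_b.insert("documents", title="B's roadmap")

rows = db_a.select("documents")  # only tenant-a rows come back
```

The point is that the tenant filter lives in exactly one place, so a forgotten `WHERE` clause in application code cannot leak another tenant's rows.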
And then some questions arise:
Should we use a single encryption key for all the sensitive data of all tenants, or a separate key for each tenant?
If we go for separate keys (generated randomly at the moment of registration, so the key won't be known even to us), then who should store and protect each tenant's encryption key, and how? Should we give the key to the tenant and then ask their employees to provide it, in addition to their login name and password, every time they log in through the web browser?
What approach would work best considering that we might later need sharding or "elastic" database scaling, as some PaaS providers call it, and that we might move from Azure and Microsoft SQL Server to something else?
If someone has experience with multi-tenancy and database protection, I'd really appreciate some advice: dos and don'ts and possible caveats. I have read some articles on the topic, but they are often too specific to particular PaaS platforms or do not explain the consequences and difficulties; that knowledge comes only from day-to-day experience and trial and error.
Adding some answers for completeness' sake (2 years later):
Should we use a single encryption key for all the sensitive data of all tenants, or a separate key for each tenant?
You should have a separate key for each tenant. I'm not sure what Azure offers here, but AWS has KMS for exactly this use case (Azure's equivalent is Key Vault).
If we go for separate keys (generated randomly at the moment of registration, so the key won't be known even to us), then who should store and protect each tenant's encryption key, and how? Should we give the key to the tenant and then ask their employees to provide it, in addition to their login name and password, every time they log in through the web browser?
Use KMS (or something similar on other clouds, or a homegrown solution like Square's Keywhiz). You don't need to give the key to the tenant. All you care about is that once a user has authenticated with their password (or SSO), their permissions determine which resources they can access.
What approach would work best considering that we might later need sharding or "elastic" database scaling, as some PaaS providers call it, and that we might move from Azure and Microsoft SQL Server to something else?
You need a tenant ID, which gives you data isolation. The actual data linked to that tenant ID can then be encrypted with a key stored in KMS.
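A sketch of what that per-tenant key hierarchy looks like, assuming an HKDF-style derivation; in production the master key would live in KMS or Key Vault and never reach application code. All names here are illustrative:

```python
import hashlib
import hmac
import os

def derive_tenant_key(master_key: bytes, tenant_id: str) -> bytes:
    """Derive a per-tenant data key from a single master key.

    In production the master key would live inside a KMS (AWS KMS,
    Azure Key Vault, ...) and the wrap/unwrap would happen there;
    this HKDF-style expansion just illustrates the key hierarchy:
    one root secret, one distinct key per tenant.
    """
    # HKDF-Extract: mix the master key with a fixed salt.
    prk = hmac.new(b"tenant-key-salt", master_key, hashlib.sha256).digest()
    # HKDF-Expand (single block): bind the key to this tenant's identity.
    info = b"data-key:" + tenant_id.encode()
    return hmac.new(prk, info + b"\x01", hashlib.sha256).digest()

master = os.urandom(32)  # in production: never leaves the KMS
key_a = derive_tenant_key(master, "tenant-a")
key_b = derive_tenant_key(master, "tenant-b")
assert key_a != key_b                                   # per-tenant isolation
assert key_a == derive_tenant_key(master, "tenant-a")   # deterministic
```

Compromise of one tenant's data key then does not expose any other tenant's data, and rotating a tenant means re-deriving and re-encrypting only that tenant's rows.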
Related
We are a SaaS product and we would like to offer per-user data exports that can be used with various analytical (BI) tools like Tableau or Power BI. Instead of managing all those exports manually, we thought of using a cloud data warehouse such as AWS Redshift (which would be part of our service). But then it is not clear how users would access those databases naturally, unless we do some kind of SSO integration with AWS.
So, what is the best practice for exporting data for analytics use in SaaS products?
In this case you can build your security into your backend API layer.
First, set up processes to load your data into Redshift, then make sure that only your backend API server/cluster has access to Redshift (e.g. through a VPC with no external IP access to Redshift).
Now that you have your data, you can validate the user as usual through your backend service. When a user requests a download through the backend API, the backend can create a query that extracts from Redshift only the data permitted by the user's security role. To make this possible you may need to build some kind of security column into your Redshift data model.
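The role-based extract step might look roughly like this; the role names and the `org_id`/`team_id` security columns are hypothetical:

```python
# Map each security role to the predicate it is allowed to see.
# Role names and the 'org_id'/'team_id' security columns are
# hypothetical; they stand in for whatever your data model uses.
ROLE_FILTERS = {
    "org_admin": "org_id = %(org_id)s",
    "team_member": "org_id = %(org_id)s AND team_id = %(team_id)s",
}

def build_export_query(table: str, user: dict) -> tuple[str, dict]:
    """Build a parameterized extract query scoped to the user's role.

    The backend API runs this against the warehouse on the user's
    behalf; the user never gets direct warehouse credentials.
    """
    try:
        predicate = ROLE_FILTERS[user["role"]]
    except KeyError:
        raise PermissionError(f"role {user.get('role')!r} may not export")
    sql = f"SELECT * FROM {table} WHERE {predicate}"
    params = {k: user[k] for k in ("org_id", "team_id") if k in user}
    return sql, params

sql, params = build_export_query(
    "events", {"role": "team_member", "org_id": 7, "team_id": 42}
)
```

Because the predicate comes from a fixed role table and the values are bound as parameters, a user cannot widen the filter to see another tenant's rows.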
I am assuming that getting the data into Redshift is not a problem.
What you are looking for, if I understand correctly, is an OEM solution.
The problem is how to mimic, inside these products, the security model you already have in place for your SaaS offering.
That depends on how complex your security model is.
If it is as simple as authenticating the user, who then has access to all tenant data (or data that can easily be filtered per user), things are simple for you. Trusted authentication will let you authenticate that user, and user filtering will let you show them everything they have access to.
But here is the kicker: if your security model is really complex, it can become really difficult to mimic it within these products.
For integrating Tableau, this link will help:
https://tableau.github.io/embedding-playbook/#
As for Power BI, I am not a fan of this product. I tried to embed a view in one of my applications and data refresh was a big issue.
It's almost like they want you to be an Azure shop for real-time reporting. (I like GCP more.)
If you create the APIs and populate datasets yourself, they have crazy restrictions like 1 MB/sec, etc.
In other cases, datasets can be refreshed only 8 times.
I gave up on them.
Very recently I got a call from Sisense and they seemed promising from an OEM perspective as well. You might want to try them.
I’m currently working with a customer where we are deploying a number of Azure PaaS services via ARM templates.
The deployment runs in VSTS in a service principal context.
As part of the deployment we need to specify an application ID and key for ACS.
So far we have made do with a key which was manually generated and passed to us.
We would like however to be able to generate new keys on the fly, both when applications change and to be able to do basic key management to avoid having keys expire on us.
Looking at the options, though, it seems as if the only way to achieve this is by setting Read and Write to all Applications in Settings/API Access/Required Permissions/AAD for the requesting Application/SSPI.
Is that really the only way or am I going about this in the wrong manner?
We are beginning to implement Sitecore for our website at my company. We are in the midst of the discovery phase and evaluating Active Directory module. We have 40-50 users who will be using Sitecore and over a 100 users who will be using some customized applications on top of Sitecore.
The consultancy we hired are asking us to not go with Active Directory since only 40-50 users will be using it. I on the other hand think that using the Active Directory module would be useful in the long run.
Do you guys have any input? What is the recommended practice?
Thanks
It really comes down to how you want to govern your CMS users. The AD module bubbles those users up into the CMS as users and thus exposes them for login. You can even do the same with groups/units. The advantage here is that if a new person joins your org and you add them to the OU or assign them to a group that has Sitecore access, then they gain access to Sitecore.
On the flip side, if you want Sitecore to be its own entity with its own user profiles and logins, it can do that in a silo without the AD connection.
To the CMS there is no difference where the users are actually authenticated, because the provider you select is low level. So the ultimate decision is more of a governance/IT/process decision, as there's really no functional difference.
My recommendation is to come up with scenarios or use cases and think through each in both setups. E.g. you hire 10 people who need author access: with the AD module you just assign them to the OU or group that inherits the author roles in Sitecore and you're done.
I have implemented the Active Directory module a few times now, and it works really well when you want users to be able to SSO into the authoring interface and manage your security access within Active Directory. You can also use it for end-user SSO if you are building something like an intranet application on Sitecore.
From a security management perspective, it becomes easier for the organization and also allows you to not worry about having to duplicate users between different environments (Dev, Test, Prod).
That being said, there is a performance overhead to the Active Directory module that is not present if you use only the native Sitecore security provider. With your number of users you probably won't see any difference, but with extremely large AD directories with complex group memberships you may run into performance issues if you use indirect membership (i.e. groups within groups).
An example scenario:
Content item in Sitecore is secured to the role MyDomain\SuperAuthor
User A is directly a member of MyDomain\SuperAuthor
User B is a member of MyDomain\SuperUser
MyDomain\SuperUser group is a member of MyDomain\SuperAuthor
If you use the Sitecore security provider, resolving User B's access is very efficient. Sitecore is able to check the indirect membership quickly using the roles within the system.
If you use the Active Directory module, indirect membership is disabled by default, so only User A would have access. If you change the configuration setting to enable indirect membership, the module will then allow User B access as well, but you will begin to see slower performance in that scenario.
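The cost difference comes from the transitive walk over group membership. A toy resolution of the scenario above, assuming a simple group graph, shows why indirect membership is more work than a flat lookup:

```python
from collections import deque

# Group graph: group -> set of parent groups it is a member of.
# The names mirror the example scenario above.
GROUP_PARENTS = {
    "MyDomain\\SuperUser": {"MyDomain\\SuperAuthor"},
    "MyDomain\\SuperAuthor": set(),
}

USER_GROUPS = {
    "UserA": {"MyDomain\\SuperAuthor"},
    "UserB": {"MyDomain\\SuperUser"},
}

def effective_roles(user: str) -> set:
    """Resolve direct and indirect memberships with a BFS.

    This is the kind of transitive walk the module must perform when
    indirect membership is enabled, touching every reachable group,
    which is why it costs more than checking direct membership only.
    """
    seen = set()
    queue = deque(USER_GROUPS.get(user, ()))
    while queue:
        group = queue.popleft()
        if group in seen:
            continue
        seen.add(group)
        queue.extend(GROUP_PARENTS.get(group, ()))
    return seen

# User B reaches SuperAuthor only through SuperUser:
assert "MyDomain\\SuperAuthor" in effective_roles("UserB")
```

With deeply nested AD groups the reachable set grows quickly, which is where the slowdown described above comes from.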
As I mentioned, however, if the part of Active Directory being pulled into Sitecore is not very complex, you should be fine and probably won't notice these performance impacts.
I don't think the number of users should be the sole reason to decide whether or not to integrate AD, nor should it be that you may or may not need it in the long run. I would say integrate with AD because of its most obvious benefits:
Single user name and password
Better security
Ease of maintenance
That said, the number of users does become an important deciding factor when you need to create several thousand users and set up authorization for them.
The most common reason users are manually created and maintained in Sitecore is when you only need a handful of author and approver accounts, mostly for the marketing team. But if you foresee implementing membership, or need to provide access and authorization based on an existing user and group policy, then go for AD integration.
I am creating a CMS from scratch and decided to use CouchDB as my database. For my CMS I need various accounts and of course different user roles (admin, author, unregistered user, etc.).
At first I thought I would program the authorization within my CMS myself, but CouchDB has this kind of thing built in, so I want to ask:
What is the best practice for creating a multi-user app with CouchDB?
Create only one admin for CouchDB and manage restrictions, roles and accounts yourself?
Use the built-in functionality of CouchDB for all of this? (Say, create a CouchDB admin user for every admin of the CMS?)
What if I want to add third-party authorization later? Say I want users to log in via Twitter/Facebook/Google?
Greetings,
Pipo
The critical question is whether you want to expose CouchDB to the public or not.
If you want to build your CMS as a classical 3-tier architecture where CouchDB is exclusively accessed from a privileged scripting layer (e.g. PHP), then I would recommend rolling your own authorization system. This gives you better control over the authorization logic. In particular, you can realize document-based read access control, which is not available in CouchDB's security system.
If instead you want to expose CouchDB to the public, things are different. You cannot write server-side logic (except for separate asynchronous listeners via the changes feed), so you will have to use CouchDB's built-in authentication/authorization system. That limits you to read access controlled at the database level (not the document level!). Write access can be controlled with validation functions. CouchDB admins should not be equated with application admins, as a CouchDB admin is rather comparable to a server admin in a traditional setting. A database admin in CouchDB is a better fit (they can change design documents and therefore modify the CMS installation, e.g. to add plugins). All other users with write access can be realized as database members.
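For the database-level read control mentioned above, CouchDB uses a per-database `_security` object (written with a PUT to `/<db>/_security`). A sketch of that object, with hypothetical role names; the `admins`/`members` and `names`/`roles` structure is CouchDB's:

```python
import json

# Per-database security object as CouchDB expects it:
# 'members' gates read access to the whole database,
# 'admins' may additionally change design documents (i.e. the
# CMS installation itself, such as adding plugins).
security_doc = {
    "admins": {"names": ["cms_admin"], "roles": ["cms_admin_role"]},
    "members": {"names": [], "roles": ["cms_author", "cms_reader"]},
}

def security_payload() -> str:
    """Serialize the object for a PUT to /<db>/_security.

    In a real deployment this request is sent with server-admin
    credentials, e.g. to http://localhost:5984/mydb/_security.
    """
    return json.dumps(security_doc)
```

Anyone whose roles intersect `members.roles` can read the database; everyone else is rejected at the HTTP layer, which is exactly the database-level (not document-level) granularity described above.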
I would prefer the second approach, because it lets you leverage all the nice CouchDB features like replication and the changes feed. However, you will have to do some filtered replication between databases with different members if you need fine-grained read access control.
If you want to use authentication mechanisms other than those offered by CouchDB, you will eventually have to modify the installation (which can be an issue if you want to use a hosted CouchDB). For a Facebook plugin see e.g. https://github.com/ocastalabs/CouchDB-Facebook-Authentication.
Imagine you're writing a web app that will have 1 million users (they all grow that big, right?).
How would you handle user accounts? I can imagine a few scenarios:
Roll your own (database tables, salted/hashed passwords stored in a user profile table)
If written with ASP.NET, use the built-in membership/role providers (which fall back to the database)
Use Active Directory if in a Windows environment
Use some other LDAP server
A 3rd party provider like OpenID or .NET Passport
Stability and scalability are of course important.
I guess this is really a question of whether Active Directory and other LDAP servers scale well and easily. What do Facebook, Twitter and Gmail use as their backend account provider?
What got me thinking about this is Google App Engine. Really cool looking. But users would need a Google Account if I used the built-in authentication. Or, with #5 above, users would need to go get an OpenID. I'm trying to make it so they can just sign up with my site without needing to visit other sites, for the non-geeks of the world :)
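For option 1 above ("roll your own"), the salted/hashed storage can be sketched with stdlib PBKDF2; a real deployment might prefer bcrypt or Argon2, but the shape is the same: a unique random salt per user and a deliberately slow hash.

```python
import hashlib
import hmac
import os

def hash_password(password: str, *, iterations: int = 600_000) -> str:
    """Produce a salted, slow hash suitable for storing in a user table.

    Uses stdlib PBKDF2-HMAC-SHA256; the iteration count makes
    brute-forcing a leaked table expensive, and the per-user salt
    defeats precomputed (rainbow-table) attacks.
    """
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"

def verify_password(password: str, stored: str) -> bool:
    """Re-derive the hash from the stored salt and compare."""
    _, iters, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iters)
    )
    # Constant-time comparison to avoid timing leaks.
    return hmac.compare_digest(candidate.hex(), digest_hex)

record = hash_password("correct horse battery staple")
assert verify_password("correct horse battery staple", record)
assert not verify_password("wrong guess", record)
```

Storing the algorithm and iteration count alongside the salt and digest, as in the string format above, lets you raise the work factor later without invalidating existing accounts.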
I would ask someone who had actually worked on a system which caters for that many users.
I'd find out about other systems like this and look at case studies that have been written about them (ask Microsoft, Oracle, IBM, etc.).
But for usability, you either need to implement a single sign-on solution, so users don't need to know their login details (perfect for the corporate world),
or
you have to go with what users know: an email address/username and password.
OpenID or similar systems are horrible for non-technical users.
(Note: anyone looking at this is a technical user.)
OpenID.
If you must give users the choice to create an account on your site, become an OP (OpenID Provider) yourself.