Handle multiple customers with their own database in Django - database

I have to develop a Django webapp which can handle several customers with their own databases.
Each customer should manage their different users and groups, with associated permissions and a set of Django's apps.
I need separate databases for security and scaling reasons. Moreover, each base size may exceed 200MB. Besides these dedicated databases, I need a "common" database to store data common to every customer (and must not be replicated to every customer database).
The fact is that I don't know how to handle the multiple customers databases (and keeping Django's auth capabilities for each customer independently : user/group/perms).
I found django-constance (https://github.com/comoga/django-constance) which allow to change settings on the fly, and I have in mind to auth with a customer identifier + username ; this way I may load the right DB with the customer identifier and auth on that one with the username.
Well, this solution does not seem to me so good, and I would appreciate if anybody has a better idea or if someone has already encountered this problem and found a workaround...
I have not found similar problems on the net, and that's quite annoying because that does not seem to me as unusual...
Thanks a lot for the time spent on this.

Keep It Simple -- leverage what Apache (or nginx or whatever) offers you.
Use mod_wsgi.
Use Apache (or nginx or whatever) to split the customer's URL's from each other
/cust1/path/to/resource
/cust2/path/to/resource
If you use Apache (or nginx or whatever) to split these things up, then each top-level path element can direct the request to a specific mod_wsgi instance
Each mod_wsgi instance has it's own settings.py that has the customer's specific database credentials.
Each customer settings.py should probably start with from master_settings import * so that you can have a "for all customers" settings which is extended by each specific customer.

Related

Designing databases and applications for hosted / cloud solutions

Are there any resources available that can guide someone on how to 'think' about the various components of a hosted / cloud solution before going ahead and starting to make a hosted application? If that made no sense, what I mean to ask is are there any guidance books/websites on what things need to be considered when making a cloud application?
I am attempting to make a hosted CRM-style software application that will serve many hundreds of customers. The application is powered by a SQL server database with many tables and a ColdFusion, HTML5, CSS, Javascript front-end. If I was installing this application and its components at each client site, then each installation is unique to that customer. But somehow I have to replicate this uniqueness in the cloud which is baffling me.
Only two things have come to mind so far:
The need for a unique database per customer in SQL server
The need to change DB connection strings per customer in the web application
My thought process has come to a block when I am trying to envisage how to design the application to serve so many different customers. Even though the application that all customers use will is the same (same DB tables, same front-end), the data that they store and retrieve will be specific to them. So I was thinking that surely each customer needs a separate database creating for them? Is it feasible to create a replica database for each customer? If I need to update some tables or add a new table, how would I do this for hundreds of different databases?
From the front-end I guess each unique customer log-in would change DB connection strings so that they can only access their database. Other than this I can't think of anything else that needs to change per customer basis.
When a new customer wants to sign up, it needs to be clear to me what I need to create for them to have access to the application. I guess this is ultimately what I need to think of but I'm stuck.
If anyone can suggest some things to think of or if there is a book or website on this kind of thing that someone could point me to I'd really be very thankful.
EDIT:
I was looking at an article about Salesforce.com and it says
"In order to ensure privacy of data for each user and give an effect of each having their own database, the data from different users are securely isolated from one another."
Anyone know how this is achieved or how it may be done?
Found some great information here. It is called multi-tenant database design and seems to be a common topic. Once I get the database designed then the application can sit nicely on top.
https://dba.stackexchange.com/questions/1043/what-problems-will-i-get-creating-a-database-per-customer

Building a web application with multiple database instances or just a single instance

I am currently designing a web application where I will have customers signing up as companies. Each company will have its own set of users. As I am designing this I am wondering which approach would work best. I see sites like fogbugz or basecamp which use subdomains. In cases with subdomains do you have a database instance per sub domain? I'm wondering if it is recommended to have a database instance per company or if I should have some kind of company table and manage the company and user data/credentials all from one database.
Which approach is best? Is there literature on this subject (i.e. any web or book)?
thanks in advance!
You have to weigh up your options, as some of this will be a matter of opinion and might not be feasible for your implementation.
That being said, I'd consider the single database approach, for these reasons:
Maintenance: when running a database per registered 'client', you will very easily reach a situation where any changes or upgrades you make to your app's schema have to be applied to every single database instance. This will get ridiculous, fast.
Convenience: You might want analytics and usage stats, or some way to administrate all these databases. Querying a single database is comparatively trivial to trying to aggregate the same query for all your databases. This isn't going to scale.
Scalability *: As mentioned in 2, you're going to require a special sort of aggregation to query things about your clients, and your app as a whole. The bigger your app gets, the more complex your querying. The other issue is, if one client uses the app a lot more than another, what will you be encouraged to optimise? Your app, the bigger client's database, or the smaller client's? Not forgetting anything you do change has to be copied to all databases.
Backups: You can backup one database easily, just by creating a dump and stashing it somewhere. Get a thousand clients and now you have to run 1000 database dumps, and name them well enough to be able to identify them if one single database corrupts. How will you even know if this happens? Database errors will be localised to that specific one, as opposed to your entire app.
UI: A user signs up or is invited to use your app, and belongs to one particular client. Are you going to save that user account to the client's database? If so, see scalability for the issue of working with that data when the user wants to change their password, or you want to email them. So, do you tell the user to let you know which database they're in so you can find them?
Simplification: You have a database per client and want to just use a single one. How do you merge them all together without significantly breaking things? There'll be primary key conflicts if you use auto incremented IDs; bookmarked URLs will break if you decide to just regenerate the keys; foreign keys across tables will no longer point to the right records. Your data integrity will go down the pan.
You mention 'white label' services that offer their product through custom subdomains. I'm not privy to how these work, but the subdomain is only a basic CNAME or A record in their DNS zonefile. The process of adding these can be automated, and the design of the application and a bit of server configuration can deal with linking these subdomains to the correct accounts and data. They're just URLs, so maybe on the backend, the app doesn't differentiate between:
http://client.example.com
http://example.com/client
Overall though, you may decide that all these problems are things you can and would prefer to deal with. Be warned, however, that by doing so you may be shooting yourself in the foot, and you can gain a lot more from crafting a well-designed single database schema and a well-abstracted front-end.
*#xQbert mentions the very real benefit of scalability with multiple databases. I've amended this answer to clarify that I was more concerned with other aspects.

openam with database

I have a requirement that openam should access users and groups from a MySQL database. In the openam GUI for New Data Store -> Database Repository (Early Access), I could see some configurations related to this. But I am not aware about how to map fields from two or three of MySQL tables (users and groups) to the corresponding attributes of openAM. Also what are the mandatory or optional fields for keeping user and group information? Somebody please point me to good documentation on this.
Also I have a couple of basic queries.
Is it possible to keep policy information in database?
Is it possible to create users, groups and assign policy information from a web application deployed differently (through JSP / servlet). Does the OpenSSO APIs allows to do this.
Thanks.
The Database data store is very primitive (hence it's Early Access), and it is very likely that it won't fit your very specific needs. In that case you can always implement your own Data Store implementation (which can be tricky, I know..), here's a guide for it:
http://docs.forgerock.org/en/openam/10.0.0/dev-guide/index.html#chap-identity-repo-spi
Per your questions:
1) the policies are stored in the configuration store, and I'm not aware of any ways to store them in DataBase
2) yes, there are some APIs which let you perform some changes with the identities stored in the data store, but OpenSSO/OpenAM is not an identity management tool, so it might not be a perfect fit for doing it.
This is pretty simple but sparse documentation makes it a bit tough to do. Check out: http://rahul-ghose.blogspot.com/2014/05/openam-database-connectivity-with-mysql.html
Basically you need to create a MySql database and 2 tables for this. Include all variables you see in the User Configuration dropdown in the Mysql user table, it is okay to remove a few attributes.

Subscription website architecture questions + SQL Server & .NET

I have a few questions about the architecture of a subscription service I am about to embark on and I am looking for some feedback on how best to set it up.
I won’t have a large amount of customers as Basecamp, maybe a few hundred and was wondering what would be a solid architecture for setting up the customer sites. I’m running SQL Server and .NET on a dedicated machine. Should create a new database for each customer as to have control and isolation of data or keep them all in one database?
I am also thinking of creating a sub-domain for each customer as well so modifications can be made to each site as needed. The customer URLs would look like this:
https://customer1.foobar.com
https://customer2.foobar.com
I am going to have the ability to ‘plug-in’ reports that will be uploaded to the site so each customer can customize as needed. Off the top of my head this necessitates having each sub domain on its own code-base for the uploading of these reports.
So on the main site the customer would sign up for their new subscription and I would programmatically create a new directory for the customer from the main code base and then create a sub domain pointing to the new directory for the customer and then finally their database.
Does this sound about right? Am I on the right track? How do other such sites accomplish the same thing?
Thanks for letting me bend your ear for a bit on this.
From a maintenance perspective, having a virtual directory for each customer scares me. Having done something similar, I would create separate domain pointers as you are intimating. Then you can check the referral headers to see what should be displayed. I would probably create one main site template and dynamically brand it for each customer. You can still create separate folders for customer specific reports or if you really need custom pages unique to that customer. I just wouldn't make each their own site.
The advantage of separate sites (including databases) is that the fate on one client isn't bound to all others. It'd be easier to upgrade (trial) to a sub-set before deploying to everyone else. The big issue here, as Scot points out, is time. You'd want to have things as automated as possible (and well tested), etc. It's also easy when a client leaves. You can always just back-up their database and send it to them (for example).
Auto-provisioning new sites and databases isn't easy, and the account that does that will need plenty of privileges - so your security testing will need to be better than usual.
A multi-tenancy approach is good for minimizing your time but you do have to be careful, you don't want customers data getting mixed up.
One approach that will work, within the one app (and database), is to make use of HttpHandlers (MVC framework, perhaps) so that some sort of client identifier is in part of the URL - but the folder doesn't have to physically exist (or virtually in the IIS sense). That way you don't have to worry about getting folder permission correct; but you do have to be careful about correctly identifying clients, their ids, and making sure clients can't make calls that use an id that isn't theirs.
https://www.foobar.com/[clientid]/subscriptions
The advantage of this is it's relatively straight forward: everything is in the application, and you don't have to worry about adding new DNS records, setting directory and/or database permissions, etc.

Should application users be database users?

My previous job involved maintenance and programming for a very large database with massive amounts of data. Users viewed this data primarily through an intranet web interface. Instead of having a table of user accounts, each user account was a real first-class account in the RDBMS, which permitted them to connect with their own query tools, etc., as well as permitting us to control access through the RDBMS itself instead of using our own application logic.
Is this a good setup, assuming you're not on the public intranet and dealing with potentially millions of (potentially malicious) users or something? Or is it always better to define your own means of handling user accounts, your own permissions, your own application security logic, and only hand out RDBMS accounts to power users with special needs?
I don't agree that using the database for user access control is as dangerous others are making it out to be. I come from the Oracle Forms Development realm, where this type of user access control is the norm. Just like any design decision, it has it's advantages and disadvantages.
One of the advantages is that I could control select/insert/update/delete privileges for EACH table from a single setting in the database. On one system we had 4 different applications (managed by different teams and in different languages) hitting the same database tables. We were able to declare that only users with the Manager role were able to insert/update/delete data in a specific table. If we didn't manage it through the database, then each application team would have to correctly implement (duplicate) that logic throughout their application. If one application got it wrong, then the other apps would suffer. Plus you would have duplicate code to manage if you ever wanted to change the permissions on a single resource.
Another advantage is that we did not need to worry about storing user passwords in a database table (and all the restrictions that come with it).
I don't agree that "Database user accounts are inherently more dangerous than anything in an account defined by your application". The privileges required to change database-specific privileges are normally MUCH tougher than the privileges required to update/delete a single row in a "PERSONS" table.
And "scaling" was not a problem because we assigned privileges to Oracle roles and then assigned roles to users. With a single Oracle statement we could change the privilege for millions of users (not that we had that many users).
Application authorization is not a trivial problem. Many custom solutions have holes that hackers can easily exploit. The big names like Oracle have put a lot of thought and code into providing a robust application authorization system. I agree that using Oracle security doesn't work for every application. But I wouldn't be so quick to dismiss it in favor of a custom solution.
Edit: I should clarify that despite anything in the OP, what you're doing is logically defining an application even if no code exists. Otherwise it's just a public database with all the dangers that entails by itself.
Maybe I'll get flamed to death for this post, but I think this is an extraordinarily dangerous anti-pattern in security and design terms.
A user object should be defined by the system it's running in. If you're actually defining these in another application (the database) you have a loss of control.
It makes no sense from a design point of view because if you wanted to extend those accounts with any kind of data at all (email address, employee number, MyTheme...) you're not going to be able to extend the DB user and you're going to need to build that users table anyway.
Database user accounts are inherently more dangerous than anything in an account defined by your application because they could be promoted, deleted, accessed or otherwise manipulated by not only the database and any passing DBA, but anything else connected to the database. You've exposed a critical system element as public.
Scaling is out of the question. Imagine an abstraction where you're going to have tens or hundreds of thousands of users. That's just not going to manageable as DB accounts, but as records in a table it's just data. The age old argument of "well there's onyl ever going to be X users" doesn't hold any water with me because I've seen very limited internal apps become publicly exposed when the business feels it's could add value to the customer or the company just got bought by a giant partner who now needs access. You must plan for reasonable extensibility.
You're not going to be able to share conn pooling, you're not going to be any more secure than if you just created a handful of e.g. role accounts, and you're not necessarily going to be able to affect mass changes when you need to, or backup effectively.
All in there seems to be numerous serious problems to me, and I imagine other more experienced SOers could list more.
I think generally. In your traditional database application they shouldnt be. For all the reason already given. In a traditional database application there is a business layer that handles all the security and this is because there is such a strong line between people who interact with the application, and people who interact with the database.
In this situation is is generally better to manage these users and roles yourself. You can decide what information you need to store about them, and what you log and audit. And most importantly you define access based on pure business rules rather than database rules. Its got nothing to do with which tables they access and everything to do with whether they can insert business action here. However these are not technical issues. These are design issues. If that is what you are required to control then it makes sense to manage your users yourself.
You have described a system where you allow users to query the database directly. In this case why not use DB accounts. They will do the job far better than you will if you attempt to analyse the querys that users write and vet them against some rules that you have designed. That to me sounds like a nightmare system to write and maintain.
Don't lock things down because you can. Explain to those in charge what the security implications are but dont attempt to prevent people from doing things because you can. Especially not when they are used to accessing the data directly.
Our job as developers is to enable people to do what they need to do. And in the situation you have described. Specifically connect to the database and query it with their own tools. Then I think that anything other than database accounts is either going to be insecure, or unneccasarily restrictive.
"each user account was a real first-class account in the RDBMS, which permitted them to connect with their own query tools, etc.,"
not a good idea if the RDBMS contains:
any information covered by HIPAA or Sarbanes-Oxley or The Official Secrets Act (UK)
credit card information or other customer credit info (POs, lines of credit etc)
personal information (ssn, dob, etc)
competitive, proprietary, or IP information
because when users can use their own non-managed query tools the company has no way of knowing or auditing what information was queried or where the query results were delivered.
oh and what #annakata said.
I would avoid giving any user database access. Later, when this starts causing problems, taking away their access becomes very dificult.
At the very least, give them access to a read-only replica of the database so they can't kill your whole company with a bad query.
A lot of database query tools are very advanced these days, and it can feel a real shame to reimplement the world just to add restrictions. And as long as the database user permissions are properly locked down it might be okay. However in many cases you can't do this, you should be exposing a high-level API to the database to insert objects over many tables properly, without the user needing specific training that they should "just add an address into that table there, why isn't it working?".
If they only want to use the data to generate reports in Excel, etc, then maybe you could use a reporting front end like BIRT instead.
So basically: if the users are knowledgeable about databases, and resources to implement a proper front-end are low, keep on doing this. However is the resource does come up, it is probably time to get people's requirements in for creating a simpler, task-oriented front-end for them.
This is, in a way, similar to: is sql server/AD good for anything
I don't think it's a bad idea to throw your security model, at least a basic one, in the database itself. You can add restrictions in the application layer for cosmetics, but whichever account the user is accessing the database with, be it based on the application or the user, it's best if that account is restricted to only the operations the user is allowed.
I don't speak for all apps, but there are a large number I have seen where capturing the password is as simple as opening the code in notepad, using an included dll to decrypt the configuration file, or finding a backup file (e.g. web.config.bak in asp.net) that can be accessed from the browser.
*not a good idea if the RDBMS contains:
* any information covered by HIPAA or Sarbanes-Oxley or The Official Secrets Act (UK)
* credit card information or other customer credit info (POs, lines of credit etc)
* personal information (ssn, dob, etc)
* competitive, proprietary, or IP information*
Not true, one can perfectly manage which data a database user can see and which data it can modify. A database (at least Oracle) can also audit all activities, including selects. To have thousands of database users is also perfectly normal.
It is more difficult to build good secure applications because you have to program this security, a database offers this security and you can configure it in a declarative way, no code required.
I know, I am replying to a very old post, but recently came across same situation in my current project. I was also thinking on similar lines, whether "Application users be Database users?".
This is what I analysed:
Definitely it doesn't make sense to create that big number of application users on database(if your application is going to be used by many users).
Let's say you created X(huge number) of users on database. You are opening a clear gateway to your database.
Let's take a scenario for the solution:
There are two types of application users (Managers and Assistant). Both needs access to database for some transactions.
It's obvious you would create two roles, one for each type(Manager and Assistant) in database. But how about database user to connect from application. If you create one account per user then you would end up linearly creating the accounts on the database.
What I suggest:
Create one database account per Role. (Let's say Manager_Role_Account)
Let your application have business logic to map an application user with corresponding role.(User Tom with Manager role to Manager_Role_Account)
Use the database user(Manager_Role_Account) corresponding to identified role in #2 to connect to database and execute your query.
Hope this makes sense!
Updated: As I said, I came across similar situation in my project (with respect to Postgresql database at back end and a Java Web app at front end), I found something very useful called as Proxy Authentication.
This means that you can login to the database as one user but limit or extend your privileges based on the Proxy user.
I found very good links explaining the same.
For Postgresql below Choice of authentication approach for
financial app on PostgreSQL
For Oracle Proxy Authentication
Hope this helps!
It depends (like most things).
Having multiple database users negates connection pooling, since most libraries handle pooling based on connection strings and user accounts.
On the other hand, it's probably a more secure solution than anything you or I will do from scratch. It leaves security up to the OS and Database server, which I trust much more than myself. However, this is only the case if you go to the effort to configure the database permissions well. If you're using a bunch of OS/db users with the same permissions,it won't help much. You'll still get an audit trail, but that's about it.
All that said, I don't know that I'd feel comfortable letting normal users connect directly to the database with their own tools.
I think it's worth highlighting what other answers have touched upon:
A database can only define restrictions based on the data. Ie restrict select/insert/update/delete on particular tables or columns. I'm sure some databases can do somewhat cleverer things, but they'll never be able to implement business-rule based restrictions like an application can. What if a certain user is allowed to update a column only to certain values (say <1000) or only increase prices, or change either of two columns but not both?
I'd say unless you are absolutely sure you'll never need anything but table/column granularity, this is reason enough by itself.
This is not a good idea for any application where you store data for multiple users in the same table and you don't want one user to be able to read or modify another user's data. How would you restrict access in this case?

Resources