I was doing some research into the optimal DB setup for a Salesforce-like application (a web app with sensitive data for tens of thousands of customers, where each customer can have multiple user accounts).
My feeling is that the split database setup (a different DB for each customer) is best, since it's the only way to make sure data is really isolated in terms of security, and because it's not just multi-customer: each customer can have multiple users who log in to work on the data. However... is a DB instance with 10,000 'small' DBs really going to work?
I don't really like the shared-database, separate-schema setup (one set of tables per customer), since you end up with tens of thousands of tables, which doesn't seem very manageable.
The one-big-DB setup with a customerID column on each table seems the easiest to implement, but not really secure... and joining tables will be quite a task, I presume.
What are your thoughts on this scenario? The separate-DBs scenario seems best to me from a scalability and security perspective... but what do I know :D
Thanks!
IMHO, you can use a separate database for each tenant or for a set of tenants. That gives better scalability, good data access speeds, easier security compliance, etc. If you need to back up a single tenant's data, it is faster and can be automated.
The one consideration is how to access data from both the shared and the separate databases.
Each tenant can even have a separate database for maintaining its user info, so that basic tenant authentication comes from the shared DB and the separate DB is then used to authenticate the individual users.
Please share your thoughts on this.
I've built a WinForms application that I'm currently rebuilding as an ASP.NET MVC application using Web API etc. Maybe an app will be added later on.
Assume that I will provide these applications to a few customers.
My applications are made for customer accounting.
So all of my customers will manage their own customers within the applications I provide.
That brings me to my question. Should I work with one big database for all my customers, or should I use a separate database for each of my customers? I'd like to ask the same about web app instances, APIs etc.
Technically I think both options are possible. If it's just a matter of preference, all input is appreciated.
Some pros and cons I could think of:
One database:
Easy to set up/maintain
Install one update for all of my customers
No possibility to restore db for one customer
Not flexible in terms of resource spreading
Performance: this DB can get really large
Multiple databases:
Performance: databases are smaller and can be spread across multiple servers
Easy to restore data if customer made a 'huge mistake'
The ability to provide customer specific needs (not needed atm)
Harder to set up/maintain: every instance needs to be updated separately.
Some kind of gateway/routing layer is needed to route users to the right database/app
I would like to know how the 'big companies' approach this.
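To make the gateway/routing item above more concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is only an illustration under assumptions: the TenantRouter class, the tenant keys, and the invoices table are hypothetical names, and a real deployment would look up connection strings for a server database rather than in-memory SQLite.

```python
import sqlite3


class TenantRouter:
    """Maps a customer (tenant) key to that customer's own database."""

    def __init__(self, catalog):
        # catalog: tenant key -> database location (hypothetical layout)
        self._catalog = dict(catalog)
        self._connections = {}

    def connection_for(self, tenant):
        if tenant not in self._catalog:
            raise KeyError(f"unknown tenant: {tenant!r}")
        # Open each tenant database lazily and reuse the connection afterwards.
        if tenant not in self._connections:
            self._connections[tenant] = sqlite3.connect(self._catalog[tenant])
        return self._connections[tenant]


# Each ":memory:" connection is its own private database, which stands in
# for one database per customer here.
router = TenantRouter({"acme": ":memory:", "globex": ":memory:"})

acme = router.connection_for("acme")
acme.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, total REAL)")
acme.execute("INSERT INTO invoices (total) VALUES (120.0)")

globex = router.connection_for("globex")
globex.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, total REAL)")

acme_count = acme.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]
globex_count = globex.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]
```

The point of the sketch is that data written through one customer's connection never shows up in another customer's database; the cost is that the catalog itself becomes one more thing to maintain.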
You seem to be talking about database multi-tenancy, and you are right about the pros and cons.
The answer to this depends a lot on the kind of application you are building and the kind of customers it will have.
I would go with a multi-tenant database (single DB, multiple tenants) if:
Your application is a multi-tenant application.
Your users do not need to store their own data backups.
Your DB schema will not change for each customer (this is implied in multi-tenant applications anyway).
Your tenants/customers will not have a huge amount of individual data.
Your customers don't have government imposed data isolation laws they need to comply with (EU data in EU, US data in US etc.).
For individual databases, pretty much the inverse of all those points applies.
First, please excuse my grammar mistakes.
OK, this is what I already know :-)
I want to use EF and MVC 4, with an AngularJS UI. I need a database per user or group of users.
My application may grow to 5000+ users. They all also have a shared resource, which is a single
database; when a user searches for something, the results will come both from the shared resource
and from the user's own database.
Performance is extremely important.
In my research I found that EF can connect to different databases, but I couldn't find a proper way of doing so without writing tons of code.
Scenarios:
A new user registers, and the system builds a new database for him.
A user logs in, and the system returns data from his database and the shared database.
A user logs in, BUT the system database has been upgraded, so the user's DB should be upgraded too.
Now I know that there is no easy method to achieve all of my goals,
but can you please direct me to what suits me best?
Again sorry for my English!
Thank you! :-)
IMHO, we have worked on several SaaS applications that use a shared database [central repository] that contains all the user [tenant] data, plus an application database [tenant-based] for every user group.
This works with ease in EF and should add no significant performance overhead. You should not use cross-database queries; instead, focus on optimizing the EF code in your data access layer, and have separate services that handle merging the user data from the shared and separate databases.
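As a rough illustration of that merging service, here is a sketch in Python with sqlite3 rather than EF (which would need a full .NET project to show). The documents table, its title column, and the search function are assumptions made up for the example; the idea is only that each database is queried on its own connection and the rows are combined in application code, with no cross-database query.

```python
import sqlite3


def search(shared_db, tenant_db, term):
    """Query each database separately and merge the rows in application code."""
    pattern = f"%{term}%"
    sql = "SELECT title FROM documents WHERE title LIKE ? ORDER BY title"
    shared_rows = [("shared", row[0]) for row in shared_db.execute(sql, (pattern,))]
    tenant_rows = [("tenant", row[0]) for row in tenant_db.execute(sql, (pattern,))]
    # The tenant's own hits come first; the ordering policy is an arbitrary choice.
    return tenant_rows + shared_rows


def make_db(titles):
    """Helper for the example: an in-memory database with a documents table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany("INSERT INTO documents (title) VALUES (?)", [(t,) for t in titles])
    return conn


shared = make_db(["Tax guide", "Holiday calendar"])
tenant = make_db(["Tax return 2013", "Payroll"])

results = search(shared, tenant, "tax")
```

Tagging each row with its origin lets the UI show where a result came from, and either query can be cached or optimized independently.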
You may also want to analyze the application, find the data that is updated infrequently, and cache it using a distributed cache like AppFabric.
As for keeping the user DB and the centralized database in sync, in the service tier you can wrap these calls in a .NET TransactionScope to preserve atomicity.
Please post your understanding and any requests for clarification in reply to this answer.
I am currently designing a web application where I will have customers signing up as companies. Each company will have its own set of users. As I am designing this I am wondering which approach would work best. I see sites like FogBugz or Basecamp which use subdomains. In cases with subdomains, do you have a database instance per subdomain? I'm wondering if it is recommended to have a database instance per company, or if I should have some kind of company table and manage the company and user data/credentials all from one database.
Which approach is best? Is there literature on this subject (e.g. any website or book)?
Thanks in advance!
You have to weigh up your options, as some of this will be a matter of opinion and might not be feasible for your implementation.
That being said, I'd consider the single database approach, for these reasons:
Maintenance: when running a database per registered 'client', you will very easily reach a situation where any changes or upgrades you make to your app's schema have to be applied to every single database instance. This will get ridiculous, fast.
Convenience: You might want analytics and usage stats, or some way to administer all these databases. Querying a single database is trivial compared with trying to aggregate the same query across all your databases. This isn't going to scale.
Scalability *: As mentioned under Convenience, you're going to require a special sort of aggregation to query things about your clients, and your app as a whole. The bigger your app gets, the more complex your querying. The other issue is, if one client uses the app a lot more than another, what will you be encouraged to optimise? Your app, the bigger client's database, or the smaller client's? Not forgetting that anything you do change has to be copied to all databases.
Backups: You can back up one database easily, just by creating a dump and stashing it somewhere. Get a thousand clients and now you have to run 1000 database dumps, and name them well enough to be able to identify them if one single database gets corrupted. How will you even know if this happens? Database errors will be localised to that specific database, as opposed to your entire app.
UI: A user signs up or is invited to use your app, and belongs to one particular client. Are you going to save that user account to the client's database? If so, see scalability for the issue of working with that data when the user wants to change their password, or you want to email them. So, do you tell the user to let you know which database they're in so you can find them?
Simplification: You have a database per client and want to just use a single one. How do you merge them all together without significantly breaking things? There'll be primary key conflicts if you use auto incremented IDs; bookmarked URLs will break if you decide to just regenerate the keys; foreign keys across tables will no longer point to the right records. Your data integrity will go down the pan.
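To illustrate the primary key problem from the Simplification point, here is a small Python/sqlite3 sketch. Everything in it is hypothetical (the projects and tasks tables, the client_id column, the merge_into helper); it only shows that two per-client databases both start their auto-incremented IDs at 1, so a merge has to hand out new keys and rewrite every foreign key through an old-to-new map.

```python
import sqlite3


def make_client_db(projects):
    """Toy per-client database: project name -> list of task titles."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE tasks (id INTEGER PRIMARY KEY, project_id INTEGER, title TEXT);
    """)
    for name, titles in projects.items():
        pid = conn.execute("INSERT INTO projects (name) VALUES (?)", (name,)).lastrowid
        conn.executemany("INSERT INTO tasks (project_id, title) VALUES (?, ?)",
                         [(pid, title) for title in titles])
    return conn


def merge_into(target, client_id, source):
    """Copy one client's rows into the shared database, remapping project IDs."""
    id_map = {}  # old per-client project id -> new id in the merged database
    for old_id, name in source.execute("SELECT id, name FROM projects").fetchall():
        cur = target.execute("INSERT INTO projects (client_id, name) VALUES (?, ?)",
                             (client_id, name))
        id_map[old_id] = cur.lastrowid
    for project_id, title in source.execute("SELECT project_id, title FROM tasks").fetchall():
        target.execute("INSERT INTO tasks (client_id, project_id, title) VALUES (?, ?, ?)",
                       (client_id, id_map[project_id], title))
    return id_map


merged = sqlite3.connect(":memory:")
merged.executescript("""
    CREATE TABLE projects (id INTEGER PRIMARY KEY, client_id INTEGER, name TEXT);
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, client_id INTEGER,
                        project_id INTEGER, title TEXT);
""")

# Both source databases contain a project with id 1.
map_a = merge_into(merged, 1, make_client_db({"Website": ["Design"]}))
map_b = merge_into(merged, 2, make_client_db({"Audit": ["Collect receipts"]}))

rows = merged.execute("""
    SELECT t.client_id, p.name, t.title
    FROM tasks t JOIN projects p ON p.id = t.project_id
    ORDER BY t.client_id
""").fetchall()
```

Even in this two-table toy the second client's project changes its ID, which is exactly what breaks bookmarked URLs; with a real schema every table holding a foreign key needs the same treatment.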
You mention 'white label' services that offer their product through custom subdomains. I'm not privy to how these work, but the subdomain is only a basic CNAME or A record in their DNS zonefile. The process of adding these can be automated, and the design of the application and a bit of server configuration can deal with linking these subdomains to the correct accounts and data. They're just URLs, so maybe on the backend, the app doesn't differentiate between:
http://client.example.com
http://example.com/client
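As a sketch of how a backend might treat those two URL styles the same way, here is a small Python function using only the standard library. The tenant_from_url name and the BASE_DOMAIN constant are assumptions for illustration, not how any particular service does it.

```python
from urllib.parse import urlsplit

BASE_DOMAIN = "example.com"  # assumption: the app's own domain


def tenant_from_url(url):
    """Return the account name from either a subdomain or the first path segment."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    suffix = "." + BASE_DOMAIN

    # Style 1: http://client.example.com
    if host.endswith(suffix):
        subdomain = host[: -len(suffix)]
        if subdomain and subdomain != "www" and "." not in subdomain:
            return subdomain

    # Style 2: http://example.com/client
    if host in (BASE_DOMAIN, "www." + BASE_DOMAIN):
        segments = [segment for segment in parts.path.split("/") if segment]
        if segments:
            return segments[0].lower()

    return None


by_subdomain = tenant_from_url("http://client.example.com")
by_path = tenant_from_url("http://example.com/client")
```

Both calls give the same account name, so everything after this lookup (which data to load, which user table to check) can be written once.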
Overall though, you may decide that all these problems are things you can and would prefer to deal with. Be warned, however, that by doing so you may be shooting yourself in the foot, and you can gain a lot more from crafting a well-designed single database schema and a well-abstracted front-end.
* @xQbert mentions the very real benefit of scalability with multiple databases. I've amended this answer to clarify that I was more concerned with other aspects.
I am required to work on a PHP project that requires the database to cater to multiple users. Generally, the idea is similar to what Carbonmade, Basecamp, or even WordPress MU have. They cater to multiple users, who are also the owners of their accounts. If they were to cancel/terminate their account, anything of theirs on the pages/database would be removed.
I am not quite sure how I should design the database. Should it be:
separate tables for individual user account
separate databases for individual user account
or otherwise?
Kindly advise me for the best approach to this issue. Thank you very much.
How many users are we talking about?
Offhand, I like the idea of having a separate database for each user account. There are many advantages:
You can keep the schema (and your application code) simple
If a user ever wanted a copy of their database you could just dump it out and give it to them
You can easily take care of security by restricting access to each database to a given user account
You may be able to scale out more easily by adding more database servers, since you are using separate databases (there would be no common tables used by all users)
Of course, this could be a bit painful for you if you need to deploy updates to hundreds of databases, but that's what automated scripting is for.
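For what that automated scripting could look like, here is a minimal sketch in Python with sqlite3. It assumes each account database records its own schema version (SQLite's built-in PRAGMA user_version is used here); the MIGRATIONS list, the pages table, and the function names are made up for the example, and a MySQL setup would keep the version in a small table instead.

```python
import sqlite3

# Ordered schema changes; entry N brings a database from version N to N + 1.
MIGRATIONS = [
    "CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT NOT NULL)",
    "ALTER TABLE pages ADD COLUMN body TEXT",
]


def upgrade(conn):
    """Apply only the migrations this database has not seen yet."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(statement)
        # PRAGMA values cannot be bound as parameters; version is a trusted int.
        conn.execute(f"PRAGMA user_version = {version}")
        conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


def upgrade_all(connections):
    """Run the upgrade against every account database and report versions."""
    return {name: upgrade(conn) for name, conn in connections.items()}


# One account created before the second migration existed, one brand new.
old_account = sqlite3.connect(":memory:")
old_account.execute(MIGRATIONS[0])
old_account.execute("PRAGMA user_version = 1")
new_account = sqlite3.connect(":memory:")

versions = upgrade_all({"old_account": old_account, "new_account": new_account})
old_columns = [row[1] for row in old_account.execute("PRAGMA table_info(pages)")]
```

Because each database remembers its own version, the script is safe to rerun and can pick up accounts that missed an earlier deployment.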
The idea of having separate tables for each user seems like a coding nightmare. Each time you reference a shared table you will have to modify the name to match the current user's copy.
If I am building a CRM web application to sell as a membership service, what is the best method to design and deploy the database?
Do I have 1 database that houses 100s of records per table or deploy multiple databases for different clients?
Is it really an issue to use a single database since I believe sites like Flickr use them?
Serving multiple clients from one system is called "multi-tenancy". See, for example, the article "Multi-Tenant Data Architecture" from Microsoft.
In a situation like a CRM system, you will probably need to have separate instances of your database for each customer.
I say this because, if you want to attract larger clients, most companies have security policies in place regarding customer data. If you store their customer data in the same database as another customer's, you run the risk of exposing one company's confidential data to another company (a competitor, etc.).
Sites like Flickr don't have to worry about this as much since the majority of us out on the Interwebs don't have such strict policies regarding our personal data.
Long term, it is easiest to maintain one database with multiple clients' data in it. Think about deployment, backup, etc. However, this doesn't keep you from having several instances of this database, each containing a subset of the full client dataset. I'd recommend growing the number of databases only after you have established the usefulness/desirability of your product. Complex infrastructure is not necessary if you have no traffic...
So, I'd just put a client id in the relevant tables and smile when client 4 comes in and the extent of your new deployment is one insert statement.
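A minimal sketch of that client id approach, in Python with sqlite3. The clients and contacts tables and the helper functions are invented for the example; the one rule that matters is that every query on client data goes through a function that filters on client_id.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE contacts (
        id INTEGER PRIMARY KEY,
        client_id INTEGER NOT NULL REFERENCES clients(id),
        name TEXT NOT NULL
    );
    CREATE INDEX contacts_by_client ON contacts (client_id);
""")


def add_client(name):
    """Onboarding a new client is a single insert."""
    return db.execute("INSERT INTO clients (name) VALUES (?)", (name,)).lastrowid


def add_contact(client_id, name):
    db.execute("INSERT INTO contacts (client_id, name) VALUES (?, ?)",
               (client_id, name))


def contacts_for(client_id):
    """Every read is scoped to one client; never query contacts without it."""
    rows = db.execute(
        "SELECT name FROM contacts WHERE client_id = ? ORDER BY name", (client_id,))
    return [row[0] for row in rows]


acme = add_client("Acme")
globex = add_client("Globex")
add_contact(acme, "Alice")
add_contact(acme, "Carol")
add_contact(globex, "Bob")

acme_contacts = contacts_for(acme)
globex_contacts = contacts_for(globex)
```

The isolation here is only as good as the discipline of always passing client_id, which is the security trade-off the other answers point out.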