Database structure for multi-users web application - database

I'm undertaking a project with a learning purpose. Since this project is compelling to me because of its topic I want to build good foundations and maybe put it live eventual.
Since my project is quite complex, to explain you what my question is I'm gonna use a fiction project that is an agenda application.
This web application will have a calendar where the user can add events and reminders.It will be used by, lets say, 10,000 users and those 10,000 users will add thousands of events and reminders.
My question is which of the two methods would you recommend related to database structure?
Should I create a separate database with reminders and events tables for each user (on user creation) and relate the databases to a user in a separate database
or should I make one table for events, one for reminders and one for users and relate them to one another in a single database?
I haven't done any multi-user web applications so far and I am not familiar with database structures approach when it comes to many users. Please if there are any design patterns that you think of, I would appreciate sharing :)

Here's my opinion:
No, you should not create a separate database for each user. It can't scale. It means that every time you add a user, you have to create a new database? Never.
One database, multiple users - that's what relational databases are born for.
10,000 users is not that large an audience. Each creating thousands of events and reminders would mean 10M events, 10M reminders. That's not considered a large relational database.
You may need to worry about partitioning and purging old records. What kind of policy will you have in place for keeping those events and reminders? What access will users have after a year? Five years? Ten years? Those would be good topics to think about, too.
Get a good book about entity/relationship modeling and read it carefully. Anything modern on Amazon will do.

I used to work with a database where each user data was held in a separate database (your option 1) and believe me it was a nightmare to work with and the company spent enormous amount of resources to consolidate all these databases to one single database and it was not an easy task.
As #duffymo stated one database/multiple users that's what relational databases are for.

Related

Architecture: one or multiple databases for sub customers (web)APP

I've built a winforms application that i'm currently rebuilding into an ASP.NET MVC application using Web API etc. Maybe an app will be added later on.
Assume that I will provide these applications to a few customers.
My applications are made for customer accounting.
So all of my customers wil manage their customers whithin the applications I provide.
That brings me to my question. Should I work with one big database for al my customers, or should I use seperate database for each of my customers? I'd like to ask the same for web app instances, api's etc.
Technicaly I think both options are possible. If it's just a mather of preference, all input is appreciated.
Some pros and cons I could think off:
One database:
Easy to setup/maintain
Install one update for all of my customers
No possibility to restore db for one customer
Not flexible in terms of resource spreading
Performance, this db can get realy large
Multiple databases:
Preformance, databases are smaller sized and can be spread by multiple servers
Easy to restore data if customer made a 'huge mistake'
The ability to provide customer specific needs (not needed atm)
Harder to setup/maintain, every instance needs to be updated seperately.
A kind of gateway/routing thing is needed to route users to the right datbase/app
I would like to know how the 'big companies' approach this.
You seem to be talking about database multi-tenancy, and you are right about the pros and cons.
The answer to this depends a lot on the kind of application you are building and the kind of customers it will have.
I would go with multi-tenant (single DB multiple tenants) database if
Your application is a multi-tenant application.
Your users do not need to store their own data backups.
Your DB schema will not change for each customer (this is implied in multi-tenant applications anyway).
Your tenants/customers will not have a huge amount of individual data.
Your customers don't have government imposed data isolation laws they need to comply with (EU data in EU, US data in US etc.).
And for individual databases pretty much the inverse of all those points.

Building a web application with multiple database instances or just a single instance

I am currently designing a web application where I will have customers signing up as companies. Each company will have its own set of users. As I am designing this I am wondering which approach would work best. I see sites like fogbugz or basecamp which use subdomains. In cases with subdomains do you have a database instance per sub domain? I'm wondering if it is recommended to have a database instance per company or if I should have some kind of company table and manage the company and user data/credentials all from one database.
Which approach is best? Is there literature on this subject (i.e. any web or book)?
thanks in advance!
You have to weigh up your options, as some of this will be a matter of opinion and might not be feasible for your implementation.
That being said, I'd consider the single database approach, for these reasons:
Maintenance: when running a database per registered 'client', you will very easily reach a situation where any changes or upgrades you make to your app's schema have to be applied to every single database instance. This will get ridiculous, fast.
Convenience: You might want analytics and usage stats, or some way to administrate all these databases. Querying a single database is comparatively trivial to trying to aggregate the same query for all your databases. This isn't going to scale.
Scalability *: As mentioned in 2, you're going to require a special sort of aggregation to query things about your clients, and your app as a whole. The bigger your app gets, the more complex your querying. The other issue is, if one client uses the app a lot more than another, what will you be encouraged to optimise? Your app, the bigger client's database, or the smaller client's? Not forgetting anything you do change has to be copied to all databases.
Backups: You can backup one database easily, just by creating a dump and stashing it somewhere. Get a thousand clients and now you have to run 1000 database dumps, and name them well enough to be able to identify them if one single database corrupts. How will you even know if this happens? Database errors will be localised to that specific one, as opposed to your entire app.
UI: A user signs up or is invited to use your app, and belongs to one particular client. Are you going to save that user account to the client's database? If so, see scalability for the issue of working with that data when the user wants to change their password, or you want to email them. So, do you tell the user to let you know which database they're in so you can find them?
Simplification: You have a database per client and want to just use a single one. How do you merge them all together without significantly breaking things? There'll be primary key conflicts if you use auto incremented IDs; bookmarked URLs will break if you decide to just regenerate the keys; foreign keys across tables will no longer point to the right records. Your data integrity will go down the pan.
You mention 'white label' services that offer their product through custom subdomains. I'm not privy to how these work, but the subdomain is only a basic CNAME or A record in their DNS zonefile. The process of adding these can be automated, and the design of the application and a bit of server configuration can deal with linking these subdomains to the correct accounts and data. They're just URLs, so maybe on the backend, the app doesn't differentiate between:
http://client.example.com
http://example.com/client
Overall though, you may decide that all these problems are things you can and would prefer to deal with. Be warned, however, that by doing so you may be shooting yourself in the foot, and you can gain a lot more from crafting a well-designed single database schema and a well-abstracted front-end.
*#xQbert mentions the very real benefit of scalability with multiple databases. I've amended this answer to clarify that I was more concerned with other aspects.

How to design a DB for several projects

Im wondering what will be the best way to organize my DB. Let me explain:
Im starting a new "big" project. This big project will be composed by few litle ones. In general the litle projects are not related to each other, they are just features of the big one.
One thing that all the projects have in common is the users that are going to use it.
So my questions are:
Should i create different DB for each one of the litle projects
(currently each project will contain 4-5 tables)
How to deal with the users? Should I create one DB for all the users
or should i
duplicate the users table in every DB? Have in mind that the
information about the users is used a lot in every litle project,
it's NOT only for identification purposes.
Thanks in advance for your advice.
This greatly depends on the database you choose to use.
If these "sub-projects" are designed to work as one coherent unit, then I strongly recommend you keep it all in the same database. One backup, one restore, one unit.
For organizational purposes, if you are using a database which supports it, select a different Schema per project. PostgreSQL and SQL Server are two databases (among others) which support this effortlessly.
In the case of a database like MySQL, I recommend you pick a short prefix for each subproject and prefix all tables accordingly. "P1_Customer" for example.
Shared data would go in it's own schema or prefix, like Global or something like that.
Actually, this was one of the many reasons we switched our main database from MySQL to PostgreSQL. We've been heavy users of both, and I really appreciate the features that PostgreSQL offers. SQL Server, if you are in a windows environment, is a great database IMO as well.
If the little projects are "features of the big one" then I don't see a reason why you wouldn't want just one user table for the main project. The way you setup the question makes this seem true "If there is a user A in little project 1, then there must be a user A in the 'big' project." If that is true, you should likely have the users in the big db instead of doing duplication unless you have more qualifying details.
i think the proper answer is 'it depends'.
Starting your organization down the path of single centralized system is good on many levels. I think in general i would recommend this.
however:
if you are going to have dramatically different development schedules, or dramatically different user experiences with the various sub projects, then you may be better off keeping them separate.
I'd have a look at OpenID or some other single sign-on protocol depending on the nature of your application. OpenID includes a mechanism called "attribute exchange", which allows applications to retrieve profile information from the OpenID provider.
This allows you to create a central user profile repository, with an authentication scheme, and have your individual apps query that repository for profile information.
The question as to how to design your database is hard to answer without more information. In most architectures, "features" within an application tend to be closely linked - "users" are related to "accounts" are related to "organisations" etc.
I'd recommend looking at the foreign key relationships to answer this question. If you have lots of foreign keys, build a single database for all tables. If you have "clusters" of foreign keys, and you want to have a different life cycle for each application (assuming the clusters map neatly to the applications), consider separate databases.
By "life cycle", I mean mostly the development lifecycle - app 1 might deploy weekly, app 2 monthly, app 3 once only and then be frozen.

Database tables - how many database?

How many databases are needed for a social website? I have my tech team working on developing a social site but all their tables are in 1 database. I wanted to create separate table sets for user data, temporary tables, etc and thinking maybe have one separate database only for critical data, etc but I am not a tech person and now sure how this works? The site is going to be a local reviews website.
This is what happens when management tries to make tech decisions...
The simple answer, as always, is as few as possible.
The slightly more complicated answer is that once your begin to push the limits of your server and begin to think about multiple servers with master/slave replication then your may want your frequent write tables separated from your seldom write tables which will lower the master-slave update requirements.
If you start using seperate databases you can also run into an with you backup / restore strategy. If you have 5 databases and backup all five, what happens when you need to restore one of them, do you then need to restore all five?
I would opt for the fewest number of databases.
The reason you would want to have multiple databases is for scaling-out to multiple machines. In the context of a "social application" where large volume / high availability is a concern. If you anticipate the need to scale out to multiple machines to handle high volumes then the breakout of tables should be those that logically need to stay together.
So, for example, maybe you want to keep tables related to a specific subject area (maybe status updates) together in one database and other tables that are related to a different subject area (let's say user's picture libraries) together in a different database.
There are logical and performance reasons to keep tables in separate physical or logical databases.
What is the reason that you want it in different databases?
You could just put all tables in one database without a problem, even with for example multiple installations of an open source package. In that case you can use table prefixes.
Unless you are developing a really BIG website, one database is the way to proceed (by the way, did you consider the possible issues that may raise when working with various databases?).
If you are worried about performance, you can always configure different tablespaces on several storage devices in order to improve timings.
If you are worried about security, just increase it (better passwords, no direct root login, no port forwarding, avoid tunneling, etc.)
I am not a tech person only doing the functional analysis but I own the project so I need to oversee the tech team. My reason to have multiple database is security and performance.
Since this is going to be a new startup, there is no money to invest into strong security or getting the database designed flawless. Plus there are currently no backup policies in place so:
1) I want to separate critical data like user password/basic profile info, then separate out user media (photos they upload on their profile) and then the user content. Then separate out the system content. Current design is to have to layers of tables: Master tables for entire system and module tables for each individual module.
2) Performance: There are a lot of modules being designed and this is a data intensive social site with lots of reporting / analytic being builtin so lots of read/writes. Maybe better to distribute load across database based on purpose?
Since there isn't much funding hence I want to get it right the first time with my investment so the database can scale & work well until revenue comes in to actually invest in getting it right. Ofcourse that could be maybe 6 months away and say a million users away too.
Oh & there is plan to add staging/production mode also so seperate or same database?
You'll be fine sticking with using one database for now. Your developers can isolate/seperate application data by making use of database schema. Working with multiple databases can quickly become a journey through a world of pain and is to be avoided unless its absolutely crucial.

Database Design for multiple users site

I am required to work on a php project that requires the database to cater to multiple users. Generally, the idea is similar to what they have for carbonmade or basecamp, or even wordpress mu. They cater to multiple users, whom are also owners of their accounts. And if they were to cancel/terminate their account, anything on the pages/database would be removed.
I am not quite sure how should I design the database? Should it be:
separate tables for individual user account
separate databases for individual user account
or otherwise?
Kindly advise me for the best approach to this issue. Thank you very much.
How many users are we talking about?
Offhand, I like the idea of having a separate database for each user account. There are many advantages:
You can keep the schema (and your application code) simple
If a user ever wanted a copy of their database you could just dump it out and give it to them
You can easily take care of security by restricting access to each database to a given user account
You may be able to scale out more easily by adding more database servers, since you are using separate databases (there would be no common tables used by all users)
Of course, this could be a bit painful for you if you need to deploy updates to hundreds of databases, but that's what automated scripting is for.
The idea of having separate tables for each user seems like a coding nightmare. Each time you reference a shared table you will have to modify the name to match the current user's copy.

Resources