Multiple companies in same database

I'm working on a system in which every "company" has its own "users" and its own "bills". Which scenario is better for performance and management: handling all companies in the same database and linking everything to an idempresa, or using a separate database for each client?

This is called multi-tenancy architecture, and each customer is a tenant. There are various strategies for dealing with it, and each one brings its own potential problems.
Having a separate database for each tenant provides data separation and does not require you to add a tenant-identifying column to your tables and queries, but it has the downside of having to keep multiple databases up to date.
Having a column in each table of a single database to identify your tenants is also a good strategy, but it brings problems when scaling and, for example, when managing different features for different customers.
You need to study all the available strategies and decide which one is best based on your requirements and pain points.
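To make the single-database option concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database. The table and column names (companies, bills, company_id) are assumptions for illustration; company_id plays the role of the idempresa column from the question.

```python
import sqlite3

# In-memory SQLite stands in for the real database; table and column
# names (companies, bills, company_id) are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE bills (
        id INTEGER PRIMARY KEY,
        company_id INTEGER NOT NULL REFERENCES companies(id),
        amount REAL NOT NULL
    );
    -- Most queries filter by tenant first, so index that column.
    CREATE INDEX idx_bills_company ON bills (company_id);
""")
conn.executemany("INSERT INTO companies (id, name) VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex")])
conn.executemany("INSERT INTO bills (company_id, amount) VALUES (?, ?)",
                 [(1, 100.0), (1, 250.0), (2, 75.0)])


def bills_for_company(conn, company_id):
    """Return bill amounts for one tenant; the tenant filter is never optional."""
    rows = conn.execute(
        "SELECT amount FROM bills WHERE company_id = ? ORDER BY id",
        (company_id,),
    )
    return [amount for (amount,) in rows]
```

Routing every query through a helper that requires the tenant ID is one way to reduce the risk of accidentally reading another company's rows.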

Putting each tenant's data in a separate database is a straightforward and less painful option, but in the long run, when your product gets wildly successful, maintaining these databases will become a nightmare.
On the other hand, keeping all the tenants' data in a single database could make your application less scalable and less performant. The better approach is a combination of both; the choice between the two depends on the type, usage, and size of each customer.
In certain cases, you may need to provision a separate database for a particular module or feature of your application, perhaps for security or to isolate specific data. I have written an article along these lines; have a look at http://blog.techcello.com/2012/07/database-sharding-scaling-data-in-a-multi-tenant-environment/

I think the scaling problem of multi-tenancy in a single database can be overcome by proper planning up front. Plan to make it easy to migrate a tenant and their data to another database any time they become big enough to justify it.
If you can automate this migration based on the tenant ID in each table, then it should be easy and safe. I'd just make sure to test it often while development of new features is going on.
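A rough sketch of such an automated migration is shown below, with two in-memory SQLite databases standing in for the shared and the dedicated database. The table names and the tenant_id column are assumptions; a real migration would also need to handle foreign keys, sequences, and concurrent writes.

```python
import sqlite3

# Tables that carry a tenant_id column; names are hypothetical.
TENANT_TABLES = ("users", "bills")


def make_db():
    """Create an in-memory database with the shared, tenant-keyed layout."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, tenant_id INTEGER NOT NULL, name TEXT);
        CREATE TABLE bills (id INTEGER PRIMARY KEY, tenant_id INTEGER NOT NULL, amount REAL);
    """)
    return db


def migrate_tenant(source, target, tenant_id):
    """Copy every row of one tenant to the target, then delete it from the source."""
    moved = 0
    for table in TENANT_TABLES:  # fixed allow-list, so interpolating the name is safe
        rows = source.execute(
            f"SELECT * FROM {table} WHERE tenant_id = ?", (tenant_id,)
        ).fetchall()
        if rows:
            placeholders = ", ".join("?" for _ in rows[0])
            target.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
        moved += len(rows)
    target.commit()  # make the copy durable before removing the originals
    for table in TENANT_TABLES:
        source.execute(f"DELETE FROM {table} WHERE tenant_id = ?", (tenant_id,))
    source.commit()
    return moved


shared, dedicated = make_db(), make_db()
shared.executemany("INSERT INTO users (tenant_id, name) VALUES (?, ?)",
                   [(1, "ann"), (2, "bob")])
shared.executemany("INSERT INTO bills (tenant_id, amount) VALUES (?, ?)",
                   [(1, 10.0), (2, 20.0), (2, 30.0)])
moved = migrate_tenant(shared, dedicated, 2)  # move tenant 2 out of the shared database
```

Because the function is driven only by the tenant ID and a list of tables, it can be run as a regular test whenever the schema changes.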
You can mitigate the risks of multi-tenant on one database. You can't really do much when there are multiple databases. You can only be diligent and disciplined to make sure all the databases stay in sync.
Good luck!!!

This is an old thread, but it's worth mentioning this for others with this question who may come across this post in the future.
I've had great success on projects in the past by using PostgreSQL and putting the global tables in the "public" schema (like users, groups, etc.) and the same set of tables for each tenant in their own separate schemas.
For example:
For every tenant that's added to the system, a new schema is created with a standard set of tables for the application:
CREATE SCHEMA tenant1;
CREATE TABLE tenant1.products (...);
CREATE TABLE tenant1.orders (...);
etc.
Each tenant's schema would have its own isolated section within the database with the same set of tables that every other tenant has but filled with their own data.
In the default "public" schema you'd have global "users" and "tenants" tables (along with tables for things like groups and access control lists). Every user belongs only to a single tenant. Upon login, the tenant for that user is looked up and from that point forward any time you connect to the database you set it to use that tenant's schema:
SET search_path TO tenant1, public;
Once the schema search_path is set, all your SQL queries can be written as if you're working with a single database with tables named "products", "orders", and so forth (along with the tables in the "public" schema). So you can just use something like "SELECT * FROM products" and it would get the products belonging to this user's tenant.
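The provisioning and connection steps above could be scripted roughly as follows. This is only a sketch that builds the SQL strings; the column definitions are placeholders I made up, and the name check is an assumption about what tenant schema names look like. Schema names cannot be passed as bound parameters, so they are validated before being interpolated.

```python
import re

# Conservative pattern for schema names; anything else is rejected because
# identifiers cannot be sent as bound query parameters.
_SCHEMA_NAME = re.compile(r"[a-z_][a-z0-9_]*")

# Hypothetical per-tenant tables; the column lists are placeholders.
TENANT_TABLES = {
    "products": "id serial PRIMARY KEY, name text NOT NULL",
    "orders": "id serial PRIMARY KEY, product_id integer NOT NULL",
}


def check_schema(schema):
    """Return the schema name unchanged, or raise if it looks unsafe."""
    if not _SCHEMA_NAME.fullmatch(schema):
        raise ValueError(f"unsafe schema name: {schema!r}")
    return schema


def provision_statements(schema):
    """SQL to run once when a tenant is added."""
    check_schema(schema)
    stmts = [f"CREATE SCHEMA {schema};"]
    for table, columns in TENANT_TABLES.items():
        stmts.append(f"CREATE TABLE {schema}.{table} ({columns});")
    return stmts


def search_path_statement(schema):
    """SQL to run on each new connection for a logged-in tenant user."""
    return f"SET search_path TO {check_schema(schema)}, public;"
```

The returned strings would then be executed through whatever PostgreSQL driver the application already uses.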

Related

CakePhp Multiple tenants - single DB versus multiple DBs

We are working on an application in CakePHP that will be used by multiple companies. We want to ensure performance, scalability, code manageability and security for our application.
Our current beta version creates a new database for each customer. When a new company joins the site we run a SQL script to create a blank database.
This has the following advantages:
- Better security (companies' users are separated from each other)
- We can set the database via the subdomain (e.g. monkey.site.com uses the site_monkey database)
- Single code base.
- Performance for SQL queries is generally quite good as data is split across smaller databases.
Unfortunately, this also has many disadvantages:
- Manageability: changes to the database have to happen across all existing databases
- The SQL script method of creation is clunky and not as reliable as we would like
- We want to allow users to log in from the home page (e.g. www.site.com), but we can't currently do this because the subdomain determines which database to use.
- We would like a central place to keep metrics/customer usage.
So we are torn/undecided as to what is the best solution to our database structure for our application.
Currently we see three options:
- Keep multiple database design
- Merge all companies into one DB and identify each by a 'companyId'
- Some kind of split model, where certain tables are in a 'core database' and others are in a customer specific database.
Can you guys offer some of your precious advice on how you think we should best do this?
Any feedback / info would be greatly appreciated.
Many thanks,
kSeudo
Just my suggestion:
I think it is better to keep the customer-related data in separate databases and the authentication-related data in a common database. When a user logs in, you look up the entry for the domain that user belongs to, redirect to that domain, and access the corresponding database and data.
As for your concern about database changes: you would need to apply the changes to each database separately. I think there are some advantages to this as well. Some customers may ask for a few changes to fit their own process, and this is easy to manage if you keep separate databases for different customers.
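As an illustration of the common authentication database, here is a small sketch with in-memory SQLite standing in for the real server. The table and column names are assumptions, the monkey / site_monkey pair is taken from the question, and password checking is deliberately left out.

```python
import sqlite3

# The shared "core" database holds only authentication and routing data.
# Table and column names here are hypothetical.
core = sqlite3.connect(":memory:")
core.executescript("""
    CREATE TABLE companies (
        id INTEGER PRIMARY KEY,
        subdomain TEXT NOT NULL UNIQUE,
        database_name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE users (
        email TEXT PRIMARY KEY,
        company_id INTEGER NOT NULL REFERENCES companies(id)
    );
""")
core.execute("INSERT INTO companies VALUES (1, 'monkey', 'site_monkey')")
core.execute("INSERT INTO users VALUES ('ann@example.com', 1)")


def route_login(core, email):
    """Return (subdomain, database_name) for a user, or None if unknown."""
    return core.execute(
        """SELECT c.subdomain, c.database_name
           FROM users u JOIN companies c ON c.id = u.company_id
           WHERE u.email = ?""",
        (email,),
    ).fetchone()
```

With a lookup like this, a login form on the home page can find the right subdomain and customer database without the user having to start from their company's URL, and the core database is also a natural place for usage metrics.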

Database schema clarification

Unfortunately, the term "schema" has come to take on different definitions for different databases. We're using SQL Server 2008 R2, and with that in mind, I have a better understanding thanks to some other questions here with people asking similar questions. However, before I begin making the database, I want to be sure I have this right for my specific scenario.
Basically it's a database for various departments of the company. For example, Administration will manage employees with a bunch of tables related to employee management. Marketing will have a lot of marketing related tables. And tech support will have a lot of tech support related tables. These "groups" will probably never interact with one another, but they're all part of the same project, so I'm putting them all in one database, rather than three separate databases.
Am I correct in understanding that this means I would want three different schemas? So that for Administration, for example, the tables would be named:
Administration.Employees
Administration.VacationDays
Administration.EmployeeAddresses
etc.
and then for tech support, for example:
Techsupport.Clients
Techsupport.OpenIssues
Techsupport.ClosedIssues
etc.
And then am I correct in understanding that the PURPOSE of this, instead of just having every table in the dbo schema, is for A) organization purposes, and B) permission purposes (users with Techsupport schema access shouldn't be able to access the Administration schema, for instance)? The idea I've come to in my head is that a schema, in the SQL Server definition, is just like a virtual folder that groups related tables together.
I think this is right, after all the similar questions that I've read, but I just really want to be sure I'm on the right path before I get too far in and realize I'm doing it completely wrong.
Is throwing everything into the dbo schema and calling it a day discouraged / not intended? Should you use a schema even for small databases that don't necessarily need multiple schemas?
Thanks.
Schemas support two primary purposes:
- Security container. Permissions can be granted on schemas, and such permissions apply to all objects in the schema. E.g. GRANT SELECT ON SCHEMA::Administration TO [foo\bar]; grants the SELECT permission on every table in the schema, including tables added in the future.
- Namespace. You can deploy your application in the schema [CptSupermarkt] and know that your app has a very low probability of a name conflict with other applications.
The prevalent use is the first one because most apps are not concerned with side-by-side deployment with other applications and usually assume ownership of an entire database (if not an entire instance). However there are types of applications (eg. audit tools and monitoring apps) that use the namespace aspect of schemas (or, at least, most should use it...).
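For the department scenario in the question, the security-container use could be scripted along these lines. This sketch only builds the T-SQL strings; the role names are hypothetical, and each statement would need to be sent as its own batch because CREATE SCHEMA must be the only statement in its batch.

```python
def schema_setup_statements(schema, role):
    """Build T-SQL that creates a schema and lets one role read everything in it."""
    for name in (schema, role):
        if not name.isidentifier():  # keep the sketch to plain identifier names
            raise ValueError(f"unexpected name: {name!r}")
    return [
        f"CREATE SCHEMA [{schema}];",
        f"GRANT SELECT ON SCHEMA::[{schema}] TO [{role}];",
    ]


# Hypothetical mapping of department schema to the database role that may read it.
DEPARTMENTS = {"Administration": "AdminStaff", "Techsupport": "SupportStaff"}

statements = [
    stmt
    for schema, role in DEPARTMENTS.items()
    for stmt in schema_setup_statements(schema, role)
]
```

Because the grant is on the schema rather than on individual tables, tables added to Administration later are covered without further changes.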

Database flexibility and privacy, hidden structure, software compatibility and 'public' permissions

I'm one of those that recently decided to migrate from MySQL to PostgreSQL and with it a lot of old habits are being torn apart. However there is functionality from MySQL I would like to preserve in PostgreSQL.
So... topics:
1. Users should have the ability to create tables under a restricted namespace.
2. Tables of one user should not be visible to other users by default (data, structure, stored procedures and whatnot).
3. Optionally, the user should be given the right to GRANT permissions to other users.
4. The default for new users is to have no permissions (read nothing, write even less).
5. Maintain compatibility with applications that are not schema aware.
Point 1:
Under MySQL the solution in place was to allow the user to create databases under the criteria 'username_%'. Under PostgreSQL I thought of having one database per user such that they can create as many schemas as they want. However there is the limitation of not being able to do joins across databases, only across schemas on the same database.
The possibility of having all as PostgreSQL schemas under the same database is not completely discarded. But then it suffers from the next point...
Point 2:
After reading this question I was inclined to think that the only way to make data completely private was to use different databases. Still I can't seem to figure out how to do it and on the other hand it conflicts with the ability to do the joins mentioned in the previous point.
Point 3:
Is this even possible, or do you need the 'Create roles' privilege and have to create a new role for the given table/schema?
Point 4:
Again, is this possible? From what I read it feels like I'm fighting the default 'public' behavior, but still I would like to have the users seeing nothing unless an admin gives them access to the information.
Point 5:
Some of the programs I use with MySQL, on which I have no direct control of the actions they perform on the database, are not schema aware. This means they simply ignore the schema layer. For this PostgreSQL provides the 'public' schema as default. However this is still a bit awkward in some cases.
It also means that by default I need one independent database per software/tool or else I need to trick the system by setting search_path to some predefined schema on a per user (role) basis.
So those are the options/solutions I've found so far. I'm fine with having to use the search_path for point 5 and sacrificing joins between tables/schemas in different databases for the sake of privacy (points 1 and 2), but I would still like to know what is the best solution to the above problems and what are the best ways to put them in practice.
With that said, I'm all ears.
PS: Links to information on how to accomplish the mentioned above are also welcome.
The solution we ended up taking is the following:
Point 1:
One database per user. Users can create as many tables and schemas as they want. Joins across databases are not possible. The alternative is to retrieve subsets and combine the results on the client, which is obviously not the most efficient way.
Point 2:
This can be accomplished by defining a specific ownership and permission for a given database and removing the default "public" behavior. With this, only users that belong to allowed groups or are the owner itself can access the content.
Note: PostgreSQL uses multiple level permissions which means that even if the database is owned by someone, tables can be owned by someone else.
Point 3:
Can be done with WITH GRANT OPTION.
Point 4:
There is no automated way to do this. The only way to ensure this is by restricting "public" access to all existing databases.
Point 5:
Using search_path on a per-user basis is the only way to do it, with multiple users to access different schemas (when needed). There is obviously the issue that a schema-unaware application cannot "reach" other schemas if no user with an appropriate search_path exists.
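The points above map onto a handful of PostgreSQL statements. The sketch below only builds them as strings; the "<role>_db" naming convention and the restriction to lowercase identifiers are assumptions made for the example.

```python
import re

# Only simple lowercase identifiers are accepted, since names are
# interpolated into SQL rather than bound as parameters.
_IDENT = re.compile(r"[a-z_][a-z0-9_]*")


def ident(name):
    """Return the identifier unchanged, or raise if it looks unsafe."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def private_database_statements(role):
    """Points 1, 2 and 4: one database per user, closed to everyone else."""
    db = ident(f"{role}_db")  # hypothetical naming convention
    return [
        f"CREATE DATABASE {db} OWNER {ident(role)};",
        f"REVOKE ALL ON DATABASE {db} FROM PUBLIC;",
    ]


def delegated_grant_statement(table, grantee):
    """Point 3: let the grantee read a table and pass that right on."""
    qualified = ".".join(ident(part) for part in table.split("."))
    return f"GRANT SELECT ON TABLE {qualified} TO {ident(grantee)} WITH GRANT OPTION;"


def default_schema_statement(role, schema):
    """Point 5: pin a schema-unaware application's role to one schema."""
    return f"ALTER ROLE {ident(role)} SET search_path TO {ident(schema)};"
```

For example, private_database_statements("alice") yields the CREATE DATABASE and REVOKE statements for a database named alice_db.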

Best-Practices for using schemas in SQL Server (2008)

I can see in the AdventureWorks database that different schemas are used to group tables. Why is this done (security, ...?) and are there best-practices I can find?
thx, Lieven Cardoen
I manage a Business Intelligence team, and we rely on schemas for logical grouping and for managing security. Here are some examples of how we use schemas:
LOGICAL ORGANIZATION
We have a general database that is loaded by SSIS packages solely for staging data before we load our operational data store (ODS). In this database, with the exception of the schema, all objects are identical in structure (table names, column names, data types, nullability, etc.) to their original source. We use the schema to indicate the original source system of the table. In some rare instances, two different databases have tables with the same name, and the schema allows us to continue to use the original name in the staging database.
In every database on our BI servers each team member has a test_username schema. When we create test objects in a database, this makes it easy to keep track of who made the object. It also makes it a lot easier to purge the test objects later since everyone knows who made what. Frankly, just knowing that we made it is usually enough to know it can be deleted safely, especially when we can't remember when or why we made it!
In our data controller database, we rely on schema to separate different types of processes between reports, etl, and generic resources.
In our star schema data warehouse, all objects are divided into dimension and fact schemas.
When we push data to other departmental servers, we make all BI objects on their servers use the schema bi. This makes it REALLY easy to know bi loads and maintains the table even though it isn't on our server. If the target server isn't a 2008/2005 SQL Server box, then we prefix the table with bi_.
When it gets down to it, we use schema for logical organization anytime we WOULD have appended a prefix or suffix to an object to help organize it in the absence of schema. Having said this, there are a few instances where we don't use schema on our BI servers. In our WorkingDB, everything is dbo. Our WorkingDB is used like TempDB to create temporary tables, but these are tables that we know we will create every time an ETL process runs. The special property of WorkingDB is that we never back up the database, and all ETL processes that use it must be able to recreate their objects from scratch if a table is missing. In this instance, we felt using schema didn't add ANY organizational value since we don't actually use the objects outside of their temporary ETL process.
SECURITY
Since we are a BI group, we don't generally build and support our own applications. We almost exclusively use other people's applications and bring data from their back-end databases to our server. However, we do have one database called bi_applications that is the back-end for a variety of small CRUD applications. These applications are usually data entry forms that we provide to the business so that they can capture data we would otherwise have to maintain in BI. It is a way of getting data that should be in production applications into BI while we wait for our low priority application enhancements to gather dust in the future development lists. Each application has a separate schema and the application account used to update the underlying tables ONLY has access to objects of the associated schema. This makes it really easy to understand, secure, and maintain the separate applications.
In a few instances, I have let power users have direct database access to our tables or stored procedures. We rely on using schema combined with roles to secure the objects. We grant permissions to the schema and users are added to roles. This allows us to easily understand which objects are used by whom without having to dig through roles to figure it out.
In short, we use schema for security purposes when we probably would have considered separating the objects out into their own databases and when we expect an application or user outside of BI to access our databases.
Although these aren't best business practices for application developers, I hope my bi use-cases may help you think of some of the ways to use schema in your end of the business.

Which database implementations allow sandboxing users in separate databases?

Can anyone tell me if there are RDBMSs that allow me to create a separate database for every user so that there is full separation of users' data?
Are there any?
I know I can add UID to every table but this solution has its own problems (for example per user database schema changes are impossible).
Don't MySQL, PostgreSQL, Oracle and so on allow you to do that? There are GRANT statements to control ACLs.
I would imagine most (all?) databases allow you to create a user to whom you can then grant database-level access. SQL Server certainly does.
Another simple solution, if you don't need the databases to be massive or scalable (say, for teaching SQL to students, or for having many testers work against their own database to isolate problems), is SQLite. That way the whole database is a single file per user, and users cannot possibly screw up or interfere with each other.
They can even mail you the databases, or install them anywhere, say at home and at work with no internet required.
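A minimal sketch of the one-SQLite-file-per-user idea, using Python's built-in sqlite3 module. The directory, the file naming, and the notes table are assumptions for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path

# One SQLite file per user; the directory and file naming are hypothetical.
DATA_DIR = Path(tempfile.mkdtemp(prefix="user_dbs_"))


def open_user_db(username):
    """Open (creating on first use) the private database file for one user."""
    if not username.isalnum():  # keep user names from escaping DATA_DIR
        raise ValueError(f"unexpected user name: {username!r}")
    conn = sqlite3.connect(DATA_DIR / f"{username}.sqlite3")
    conn.execute("CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)")
    return conn


alice, bob = open_user_db("alice"), open_user_db("bob")
alice.execute("INSERT INTO notes (body) VALUES ('only alice sees this')")
alice.commit()

# Each connection points at a different file, so the data never mixes.
alice_count = alice.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
bob_count = bob.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
```

Since each user's data lives in its own file, per-user schema changes are also possible: altering a table in one file has no effect on anyone else's.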
MS SQL Server 2005 is one that can be used for multiple users. An instance can be created; if you have any, set up the privileges and use one user per instance.
Oracle lets you create a separate schema (a set of tables, indexes, functions, etc.) for each individual user. This is good if users should have separate, different tables. Creating a new user could be a very expensive operation, as you would be creating new tables. Updating is a nightmare as well, as you need to update the model for each user.
If you want everyone to have the same set of tables but only be able to view their own records, then you could use the Fine-Grained Access Control or Virtual Private Database features to do this.

Resources