When to Create DB and When to Create Schema - database

This seems a design question but I wanted to know if there is a pattern or design consideration we need to have where we would want to create a Database and not a new schema.
why not create one big database and separate schemas. Under what circumstance should we create a new database.

They are just logical divisions, so for the most part it's a matter of preference. There is one place where it's not a matter of preference: replication.
As of September, 2022, the unit of replication is the database. It's possible to specify which databases you want to replicate, but not which schemas within a database to replicate.
If you plan to replicate, you'll want to think about keeping only the schemas/tables that are important to replicate in one or more databases that get replicated and keep other data in databases that do not get replicated.

Another thought could be, In a large DWH Enterprise Solution,
There can be variety of flavours of tables which You can map to different databases. Sales DB, Master DB, Finance DB for ex. Then Inside DBs, You may want to have schemas for tables, views ,procedures and other object .

Related

MS SQL Server: central database and foreign keys

I'm am currently developing one project of many to come which will be using its own database and also data from a central database.
Example:
the database "accountancy" with all accountancy package specific tables.
the database "personelladministration" with its specific tables
But we also use data which is general and will be used in all projects like "countries", "cities", ...
So we have put these tables in a separate database called "general"
We come from a db2 environment where we could create foreign keys between databases.
However, we are switching to MS SQL server where it is not possible to put foreign keys between databases.
I have seen that a workaround would be to use triggers, but I'm not convinced that is a clean solution.
Are we doing something wrong in our setup? Because it seems right to me to put tables with general data in a separate database instead of having a table "countries" in every database, that seams difficult to maintain and inefficiënt.
What could be a good approach to overcome this?
I would say that countries is not a terrible table to reproduce in multiple databases. I would rather duplicate static data like that than use more elaborate techniques. There is one physical schema per database in sql server and the schema can not be shared. That is why people use replication or triggers for shared data.
I can across this problem a while back. We have one database for authentication, however, those users have to be shared across multiple applications some of which have their own database.
Here is my question on this topic.
We resorted to replication and using an custom Authentication/Registration service agent to keep the data up to data.
Using views, in what Sourav_Agasti suggested in his answer, would be the most straight forward approach for static data. You can create views and indexed views and join data from databases on linked servers.
Create a loopback linked server and then create a view(if required, on each database) which accesses the table in this "central database" through this linked server. There will be a minor performance impact but it more than enough compensates by being very simiplistic.

Linking tables between databases

I’m after a bit of advice on the best way to go about this is SQL server 2008R2 express. I have a number of applications that are in separate databases on the same server. They are all “plugins” that use a central staff/structure list that will be in a separate database. The application is in the process of being migrated from JET.
What I’m looking for is the best way of all the “plugin” databases being able to see the central database and use those tables in standard queries and views etc.
As I’m using express that rules out any replication solution and so far the only option I can think of is to use triggers or a stored procedure to “push” out all the changes to the plugins. The information needs to be populated on a near enough real time basis however the number of changes will be very small maybe up to 100 a day and the biggest table only has about 1000 rows at the moment (the staff names table).
Hopefully that will cover all everything but if anyone needs any more details then just ask
Thanks
Apologies if I've misunderstood, but from your description it sounds like all these databases are hosted on the same instance of SQL Server - it's your mention of replication that makes me uncertain.
Assuming that's the case, you should be able to replace any copies of tables from the central database which are held in the "plugin" databases with views or synonyms which reference the central tables directly, since SQL server allows you to make references between databases on the same server using three-part naming (database_name.schema_name.object_name)
For example, if each plugin db has a table StaffNames, you could replace this with a view by dropping the table, then creating a view:
drop table StaffNames
go
create view StaffNames
as
select * from <centraldbname>.<schema - probably dbo>.StaffNames
go
and your code should continue to work seamlessly, as long as permissions are set up.
Alternatively, you could replace all the references to the shared tables in the plugin databases with three-part name references to the central database, but the view method requires less work.

Postgresql - one database for everyone, or one-database per customer

I'm working on a web-based business application where each customer will need to have their own data (think basecamphq.com type model) For scalability and ease-of-upgrades, I'd prefer to have a single database where each customer gets a filtered version of the data. The problem is how to guarantee that they stay sandboxed to their own data. Trying to enforce it in code seems like a disaster waiting to happen. I know Oracle has a way to append a where clause to every query based on a login id, but does Postgresql have anything similar?
If not, is there a different design pattern I could use (like creating a view of each table for each customer that filters)?
Worse case scenario, what is the performance/memory overhead of having 1000 100M databases vs having a single 1Tb database? I will need to provide backup/restore functionality on a per-customer basis which is dead-simple on a single database but quite a bit trickier if they are sharing the database with other customers.
You might want to look into adding Veil to your PostgreSQL installation.
Schemas plus inherited tables might work for this, create your master table then inherit tables into per-customer schemas which provide a company ID or name field default.
Set the permissions per schema for each customer and set the schema search path per user. Use the same table names in each schema so that the queries remain the same.

Grouping ETL Staging Tables With User Schemas?

I was thinking of putting staging tables and stored procedures that update those tables into their own schema. Such that when importing data from SomeTable to the datawarehouse, I would run a Initial.StageSomeTable procedure which would insert the data into the Initial.SomeTable table. This way all the procs and tables dealing with the Initial staging are grouped together. Then I'd have a Validation schema for that stage of the ETL, etc.
This seems cleaner than trying to uniquely name all these very similar tables, since each table will have multiple instances of itself throughout the staging process.
Question: Is using a user schema to group tables/procs/views together an appropriate use of user schemas in MS SQL Server? Or are user schemas supposed to be used for security, such as grouping permissions together for objects?
This is actually a recommended practice. Take a look at the Microsoft Business Intelligence ETL Design Practices from the Project Real. You will find (download doc from the first link) that they use quite a few schemata to group and identify objects in the warehouse.
In addition to dbo and etl, they also use admin, audit, part, olap and a few more.
I think it's appropriate enough, it doesn't really matter, you could use another database if you liked which is actually what we do.
I'm not sure why you would want a validation schema though, what are you going to do there?
Both the reasons you list (purpose/intent, security) are valid reasons to use schemas. Once you start using them, you should always specify schema when referencing an object (although I'm lazy and never specify dbo).
One trick we use is to have the same-named table in each of several schemas, combined with table partitioning (available in SQL 2005 and up). Load the data in first schema, then when it's validated "swap" the partition into dbo--after swapping the dbo partition into a "dumpster" schema copy of the table. Net Production downtime is measured in seconds, and it's all carefully wrapped in a declared transaction.

schema in sql server 2008

what is the difference between creating ordinary tables using 'dbo' and creating tables using schemas.How this schema works & supports the tables
A schema is just a container for DB objects - tables, views etc. It allows you to structure a very large database solution you might have. As a sample, have a look at the newer AdventureWorks sample databases - they have a number of schemata included, like "HumanResources" and so forth.
A schema can be a security boundary, e.g. you can give or deny certain users access to a schema as a whole. A schema can also be used to keep tables with the same name apart, e.g. you could create a "user schema" for each user of your application, and have a "Settings" table in each of them, holding that user's settings, e.g. "Bob.Settings", "Mary.Settings" etc.
In my experience, schemata are not used very often in SQL Server. It's a way to organize your database objects into containers, but unless you have a huge amount of database objects, it's probably something you won't really use much.
dbo is a schema.
See if this helps.
Schema seems to be a way of categorizing objects (tables/stored procs/views etc).
Think of it as a bucket to organize related objects based on functionality.
I am not sure, how logged in SQL user is tied to a specific schema though.

Resources