From Database System Concepts, by Silberschatz et al:
4.5.7 Schemas, Catalogs, and Environments
Like early file systems, early database systems also had a single name
space for all relations. Users had to coordinate to make sure they did
not try to use the same name for different relations. Contemporary
database systems provide a three-level hierarchy for naming relations.
The top level of the hierarchy consists of catalogs, each of which can
contain schemas. SQL objects such as relations and views are contained
within a schema. (Some database implementations use the term
“database” in place of the term catalog.)
In order to perform any actions on a database, a user (or a program)
must first connect to the database. The user must provide the user name
and usually, a password for verifying the identity of the user. Each
user has a default catalog and schema, and the combination is unique
to the user. When a user connects to a database system, the default
catalog and schema are set up for the connection; this corresponds to
the current directory being set to the user’s home directory when the
user logs into an operating system.
To identify a relation uniquely, a three-part name may be used, for
example, catalog5.univ_schema.course. We may omit the catalog
component, in which case the catalog part of the name is considered to
be the default catalog for the connection. Thus if catalog5 is the
default catalog, we can use univ_schema.course to identify the same
relation uniquely.
1. A relation has a schema, which is the collection of all the
attributes of the relation. The "schema" in the above quote seems
to correspond to more than one relation. Does "schema" in the
above quote mean the same as the schema of a relation?
2. What is the relation between catalogs and databases? Is the relation
between catalogs and databases one-to-one?
3. What do the catalogs and schemas look like in MySQL, PostgreSQL, or
SQL Server?
Thanks.
Your first sentence in # 1 makes no sense.
A table/relation like “person” has attributes/columns like “name”, “phone”, and “email”.
Tables are grouped together in a namespace known as a schema. So a schema such as “warehouse” can have a table named “person” while another schema such as “sales” can also have a table coincidentally named “person”. Each catalog has one or more schemas, each schema carrying a name such as “warehouse” and “sales” seen here.
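The namespace idea can be tried out with a minimal sketch. This one uses Python's built-in sqlite3 module purely as a stand-in: SQLite has no CREATE SCHEMA, so each attached in-memory database plays the role of a named schema here. That substitution and the sample rows are my assumptions; only the names “warehouse”, “sales”, and “person” come from the text above.

```python
import sqlite3

# SQLite has no CREATE SCHEMA; an attached database is addressed with the
# same schema.table syntax, so it stands in for a schema in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS warehouse")
conn.execute("ATTACH DATABASE ':memory:' AS sales")

# Two tables with the same name coexist because each lives in its own namespace.
conn.execute("CREATE TABLE warehouse.person (name TEXT, phone TEXT, email TEXT)")
conn.execute("CREATE TABLE sales.person (name TEXT, phone TEXT, email TEXT)")

# Sample rows are invented for illustration.
conn.execute("INSERT INTO warehouse.person VALUES ('Alice', '555-0100', 'alice@example.com')")
conn.execute("INSERT INTO sales.person VALUES ('Bob', '555-0101', 'bob@example.com')")

# Qualifying the table name with its schema picks the intended table.
warehouse_names = [row[0] for row in conn.execute("SELECT name FROM warehouse.person")]
sales_names = [row[0] for row in conn.execute("SELECT name FROM sales.person")]
print(warehouse_names, sales_names)
```

In Postgres or SQL Server you would create real schemas inside one database instead of attaching databases, but the qualified-name lookup reads the same way.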
A schema commonly acts as a security boundary, besides being a namespace. As far as I know, that is an implementation detail, not required by the SQL standard.
The word “schema” is also commonly used in a different, more casual and general way, to describe the tables & columns design choices made to fit the needs of an application. See first comment by IMSoP below. A schema in the casual sense might involve any number of catalogs, schemas, tables, and columns in the formal SQL Standard sense.
As for # 2, your quotation explains that. “Catalog” and “database” are synonyms. The word “catalog” is used formally by the SQL standard.
For # 3, advanced databases striving to implement the SQL standard typically support all levels defined by the standard: cluster > catalog > schema > table. This includes both Postgres and Microsoft SQL Server.
H2 Database Engine supports separate databases, each being a catalog with schemas, but no cluster grouping the catalogs/databases together.
MySQL is more limited and does not support the full hierarchy, from what I can tell in my limited searching of MySQL documentation.
For more info, see this related Question: What's the difference between a catalog and a schema in a relational database?
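To make the three-part naming from the textbook quote concrete, here is a rough sketch of how a partially qualified name could be resolved against a connection's defaults. The function and type names are hypothetical, and the one-part case (falling back to the default schema as well) is my assumption based on the quote saying that a default catalog and schema are set up for each connection. It is not how any particular database parses names, and it ignores quoted identifiers that contain dots.

```python
from typing import NamedTuple


class QualifiedName(NamedTuple):
    """Fully qualified relation name: catalog.schema.relation."""
    catalog: str
    schema: str
    relation: str


def resolve(name: str, default_catalog: str, default_schema: str) -> QualifiedName:
    """Fill in missing leading parts of a dotted name from connection defaults.

    Hypothetical helper for illustration only; quoted identifiers are not handled.
    """
    parts = name.split(".")
    if len(parts) == 3:
        # catalog.schema.relation: nothing to fill in
        return QualifiedName(parts[0], parts[1], parts[2])
    if len(parts) == 2:
        # schema.relation: the catalog comes from the connection default
        return QualifiedName(default_catalog, parts[0], parts[1])
    if len(parts) == 1:
        # relation only (assumed case): both catalog and schema are defaulted
        return QualifiedName(default_catalog, default_schema, parts[0])
    raise ValueError(f"expected 1 to 3 dot-separated parts, got {len(parts)}")


full = resolve("catalog5.univ_schema.course", "catalog5", "univ_schema")
short = resolve("univ_schema.course", "catalog5", "univ_schema")
print(full == short)
```

With catalog5 as the default catalog, both spellings resolve to the same relation, which is the point the quote makes.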
Related
When designing databases, I have been following the conventions of the Microsoft AdventureWorks sample database. They use schemas to logically separate groups of tables, e.g. Person, Production or Sales. It makes a lot of sense from a security point of view as well as from an organizational pov.
However, I have some tables that are used in multiple schemas. For example, a Country table that contains all countries. It wouldn't make sense to assign a specific schema to it, e.g. Person.Country or Production.Country, as it is used in tables of different schemas.
Therefore, which schema do I assign it to?
You can use the "dbo" schema; it's the default schema in SQL Server.
I was assigned the task of creating a simple database management system in a class, so I looked at Postgres and noticed that the CLI tool (psql) has commands (\d and \l) that output information about databases and the columns of a table in tabular form, just like the result of a SELECT. So my question is: does Postgres manage user tables inside system tables, so that when you run \d or \l you are actually doing a SELECT on those system tables? I want to understand whether that would be a good way of managing tables in a database, or whether I should just use regular data structures like lists.
It does indeed. You can run psql with -E to see the queries it is using.
Then check the online manuals.
The items to search for are "system catalogs" and "INFORMATION_SCHEMA". The latter is a standard way of describing database schemas and should mostly work between different RDBMS.
Yes, Postgres uses tables that it creates to manage the tables that you create.
There is an entire chapter in the documentation explaining. To quote:
The system catalogs are the place where a relational database management system stores schema metadata, such as information about tables and columns, and internal bookkeeping information. PostgreSQL's system catalogs are regular tables.
As mentioned in the other Answer, the SQL standard requires that metadata be provided in table structures defined within the standard. These must be housed within a schema named exactly INFORMATION_SCHEMA. Postgres provides that schema and its prescribed tables, but implements them as views over the actual system tables. See the chapter on INFORMATION_SCHEMA in the Postgres documentation.
You can access the metadata, such as to get a list of all the tables you have defined, or get a list of all the columns you defined in a particular table. To do so, perform a query in SQL using SELECT like any other query.
For portability, meaning to write code that works in other database systems in addition to Postgres, query against INFORMATION_SCHEMA.
For additional details not required by the SQL standard, and for Postgres-specific info, query against the Postgres-specific system tables. Their names all start with pg_.
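As a small illustration of "metadata is just more tables you can SELECT from", here is a sketch using Python's built-in sqlite3 module, chosen only because it runs without a server. SQLite keeps its own catalog in a table called sqlite_master, which plays roughly the role that the pg_ tables play in Postgres. The course table and its columns are made up for the example, and the two Postgres queries are included as plain strings for reference only (they are not executed here).

```python
import sqlite3

# For reference only, not executed: queries in the INFORMATION_SCHEMA style
# that Postgres accepts. The schema and table names are illustrative.
PG_LIST_TABLES = (
    "SELECT table_name FROM information_schema.tables "
    "WHERE table_schema = 'public'"
)
PG_LIST_COLUMNS = (
    "SELECT column_name, data_type FROM information_schema.columns "
    "WHERE table_name = 'course'"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT, credits INTEGER)"
)

# SQLite stores its catalog in an ordinary queryable table, sqlite_master.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, dflt_value, pk).
column_names = [row[1] for row in conn.execute("PRAGMA table_info(course)")]

print(table_names, column_names)
```

For a class project, this suggests one reasonable design: bootstrap a couple of hard-coded catalog tables, then describe every user table as rows in them, so that listing tables is just another query.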
I'm new to SQL Server and have some questions about schemas. My university database textbook says a database schema is something like a database's structure described in a formal language.
But it seems like a SQL Server schema is more about ownership. Why can one term have two meanings?
Yes. The word "schema" means two different things.
"schema" in English means "plan, or technical design, or model", and as applied to databases it means the design of all the tables, columns, foreign keys, etc in a database. This is common in database literature going way back.
So the short answer is that "schema" means "the design of a set of tables", but many database systems can manage multiple, independent designs, or schemas. So the word "schema" also came to mean "the subdivision of a database containing a set of related tables".
Consider a database server whose job today is to house one database. Likely the database will be moved in the future to another database instance which houses multiple databases & schemas.
Let's pretend the app/project is called Invoicer 2.0. The database is called AcmeInvoice. The database holds all the invoice, customer, and product information. Here's a diagram of the actors and their roles and behaviour.
The schema(s) will largely be used to easily assign permissions to roles. The added benefit here is that the objects aren't under dbo, and that the objects & permissions can be ported to another machine in the future.
Question
What conventions do you use when naming the schema?
Is it good form to name the schema the same as the database?
I would think that if your schema name ends up being the same as your database name, then you are just adding redundancy to your database. Find objects in your database that have a common scope or purpose and create a schema to reflect that scope. So, for example, if you have an entity for Invoices, and you have some supporting lookup tables for invoice states, etc., then put them all in an invoice schema.
As a general rule of thumb, I would try to avoid using a name that reflects the application name, database name or other concrete/physical things because they can change, and find a name that conceptually represents the scope of the objects that will go into the schema.
Your comment states that "the schemas will largely be used to easily assign permissions to roles". Your diagram shows specific user types having access to some/all tables or some/all stored procs. I think organizing objects into schemas conceptually and organizing them into schemas from a security standpoint are conflicting goals. I am in favour of creating roles in SQL Server to reflect the types of users, and granting those roles access to the specific objects that each user type needs, as opposed to building your security framework by granting the role or user access to the schema.
Why would you name the schema the same as the database? This means all database objects fall under the same schema. If this is the case, why have a schema at all?
Typically schemas are used to group objects within a common scope of activity or function. For example, given what you've described, you might have an Invoice schema, a Customer schema and a Product schema. All Invoice-related objects would go into the Invoice schema, all Customer-related objects would go into the Customer schema, and the same for Products.
We often will use a Common schema as well which includes objects that might be common to our entire application.
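A shared lookup table in a Common schema can still be joined from tables in the other schemas. Here is a minimal sketch of that layout, again using Python's sqlite3 with attached in-memory databases standing in for schemas (an assumption on my part, since SQLite has no CREATE SCHEMA). The table names, columns, and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attached databases stand in for the Common and Customer schemas.
conn.execute("ATTACH DATABASE ':memory:' AS common")
conn.execute("ATTACH DATABASE ':memory:' AS customer")

# The shared lookup table lives in one place only.
conn.execute("CREATE TABLE common.country (code TEXT PRIMARY KEY, name TEXT)")
# A table in another schema refers to it by country code.
conn.execute(
    "CREATE TABLE customer.contact ("
    "id INTEGER PRIMARY KEY, name TEXT, country_code TEXT)"
)

conn.execute("INSERT INTO common.country VALUES ('NZ', 'New Zealand')")
conn.execute("INSERT INTO customer.contact VALUES (1, 'Acme Ltd', 'NZ')")

# Cross-schema join: each table is qualified with its schema name.
rows = conn.execute(
    "SELECT c.name, k.name "
    "FROM customer.contact AS c "
    "JOIN common.country AS k ON k.code = c.country_code"
).fetchall()
print(rows)
```

One caveat of the stand-in: SQLite does not enforce foreign keys across attached databases, whereas in SQL Server a foreign key can reference a table in another schema of the same database.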
I would call the database AcmeInvoice (or another suitable name) and the schema Invoicer2.
My reasons are as follows: AcmeInvoice means I am grouping all of that application's objects/data together. It can therefore be moved as one unit to other machines (a backup/restore or detach/attach).
The schema would be Invoicer2. Applications change, maybe in the future you will have Invoicer21 (you would create a schema), or perhaps a reporting module or system (Reports schema).
I find that the use of schemas allows me to separate data/procedures in one database into different groups, which makes it easier to administer permissions.
What is the importance of a schema in SQL Server?
Where does a schema help me?
Is it important for security reasons?
Yes, the primary purpose of a SQL Server schema was, and is, to facilitate security management: defining who (which principals) can access what (which database objects). This became considerably easier starting with SQL Server 2005, when the schema stopped being directly tied to the owner.
Another use of schema is to serve as a namespace, that is preventing name clashes between objects from different schemas.
The original use of this was to allow multiple [interactive, i.e. ad-hoc like] users of a given database to create their own tables or stored procedures (or other objects), without having to worry about the existence of similarly named objects possibly introduced by other users.
The namespace-like nature of schemas can also be put to use in a planned database setting, i.e. one where a single architect designs the database structure in a way that provides distinct types of access, and indeed different behaviors, for distinct user groups.
They partition your database to make management easier.
This is from MSDN:
A schema is now a distinct namespace that exists independently of the database user who created it. In other words, a schema is simply a container of objects. A schema can be owned by any user, and its ownership is transferable.
Here's the page that came from: http://msdn.microsoft.com/en-us/library/ms190387.aspx
In relation to security it makes it simpler to assign permissions as you can grant someone access to a schema without exposing your entire database to them.
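To show why a schema-level grant is convenient, here is a toy model in Python. It is only an illustration of the idea, not how SQL Server implements permissions: the class, method, principal, and object names are all hypothetical. In SQL Server itself the corresponding statement would be along the lines of GRANT SELECT ON SCHEMA::Sales TO reporting_role.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class ToyDatabase:
    """Toy permission model: one grant on a schema covers every object in it."""
    # schema name -> names of the objects it contains
    schemas: Dict[str, Set[str]] = field(default_factory=dict)
    # (principal, schema) pairs holding a schema-wide SELECT grant
    schema_grants: Set[Tuple[str, str]] = field(default_factory=set)

    def create_object(self, schema: str, name: str) -> None:
        self.schemas.setdefault(schema, set()).add(name)

    def grant_select_on_schema(self, principal: str, schema: str) -> None:
        self.schema_grants.add((principal, schema))

    def can_select(self, principal: str, schema: str, name: str) -> bool:
        exists = name in self.schemas.get(schema, set())
        return exists and (principal, schema) in self.schema_grants


db = ToyDatabase()
db.create_object("Sales", "Invoice")
db.create_object("HR", "Salary")

# A single grant on the schema, not one grant per table.
db.grant_select_on_schema("reporting_role", "Sales")

# An object added to the schema later is covered by the same grant.
db.create_object("Sales", "Customer")

print(db.can_select("reporting_role", "Sales", "Invoice"))
print(db.can_select("reporting_role", "Sales", "Customer"))
print(db.can_select("reporting_role", "HR", "Salary"))
```

The grant exposes everything in Sales, including objects added afterwards, while the HR schema in the same database stays out of reach.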
What a schema is changed with the release of SQL Server 2005. I think of it as an additional security layer as well as a container of objects.
This is quite a good resource:
http://msdn.microsoft.com/en-us/library/ms190387(SQL.90).aspx
A schema is mainly used to manage several logical entities in one physical database.
Schemas offer a convenient way to separate database users from database object owners. They give DBAs the ability to protect sensitive objects in the database, and also to group logical entities together.
This is especially advantageous in situations where those objects are often utilized as a unit by applications. For example, a hotel-management system may be broken down into the following logical entities or modules: Rooms, Bar/Restaurant, and Kitchen Supplies.
These entities can be stored as three separate physical databases. Using schemas however, they can be combined as three logical entities in one physical database. This reduces the administrative complexity of managing three separate databases.
Source