Table column name design - database

Let's say I have the following tables: Customer and Staff.
Customer (CusID, CusName, CusAddres, CusGender)
Staff (StaID, StaName, StaAddress, StaGender)
vs
Customer(ID, Name, Address, Gender)
Staff(ID, Name, Address, Gender)
Which design is preferred and why?

SQL employs a concept known as domain name integrity which means that the names of objects have a scope given by their container.
Column names have to be unique, but only within the context of the table that contains the columns. Table names have to be unique, but only within the context of the schema that contains the tables, etc.
When you query columns you need to reference the schema, table and colum that you are interested in, unless one or more of these can be inferred. Unless your query is so simple that it only references one table, you're going to need to reference the table name directly or by using an alias, e.g. Customer.ID or C.ID from Customer C, etc.
The first option is a throw-back to technical requirements for uniqueness of all column names, which applied to old ISAM databases and to languages like COBOL in the 1960s and 70s. This got dragged along for no good reason into dBase in the 1980s and has stuck as a convention well into the relational and object DBMS eras. Resist this outdated convention.
The second approach is much simpler and more readable. Simpler and more readable code is easier to write and much easier to maintain.

Keys are "propagated" down foreign keys, so it's useful if they keep their names constant in all the resulting "copies". It just makes the database schema clearer: you don't have to look into FK definition1 to see where a particular field came from - you can do that by just glancing at its name.
On the other hand, non-key fields are not propagated and there is no particular reason to keep their names unique outside their respective tables.
So, I'd recommend a hybrid approach:
Prefix key field names to keep them unique among all tables and avoid any need for renaming propagated fields.
Don't prefix non-key field names to keep them shorter, even though this might lead to some name repetition in different tables.
For example:
Customer (CustomerID, Name, Addres, Gender)
Staff (StaffID, Name, Address, Gender)
And if you happen to have a junction table between the two2, it'll look like this...
CustomerStaff(CustomerID PK FK1, StaffID PK FK2)
...so no need for renames (that would be necessary if both parent keys were named ID) and it's immediately clear where CustomerID and StaffID came from. Also, I personally don't like shortening table names in the prefix (hence CustomerID and not CusID), as this makes naming even more "mechanical" and predictable.
1 Potentially multiple levels of them!
2 Just an example. Whether it makes sense is another matter.

ID is a SQL antipattern (http://www.amazon.com/SQL-Antipatterns-Programming-Pragmatic-Programmers/dp/1934356557/ref=sr_1_1?s=books&ie=UTF8&qid=1343835938&sr=1-1&keywords=sql+antipatterns), never name a column ID.
I would use:
Customer(CustomerID, Name, Address, Gender)
Staff(StaffID, Name, Address, Gender)

I would say that the second option is better. If you decide to change the table name, you won't have to change the names of all the columns. Also when you write your SELECT statements, you usually use an alias anyway ("select sta.name, sta.adress from staff sta...")
In general, when in doubt, always pick a simpler solution.

Every Industry has its own standard. In those both style of designing's are normal. It is recommended to use First one.
Customer(CusID, CusName, CusAddres, CusGender)
Staff(StaID, StaName, StaAddress, StaGender)
Because it is using table name symbol like Customer to Cus and Staff to Sta. It helps
the project good to maintenance.
Some Organizations are also using vch for VARCHAR data type,int for INT data type.
Like
Customer(intCusID, vchCusName, vchCusAddres, vchCusGender)
Staff(intStaID, vchStaName, vchStaAddress, vchStaGender)

Related

How to ensure column name does not match with any SQL Server reserved word

I'm using a query command to create a SQL Server table. I don't want the column name do match any of the SQL Server keywords (ex: use, create,...). How to check the column names (in query string) do not match any SQL Server keywords?
Update 1:
Of course, except for creating a keyword list, then go hand-in-hand comparison.
Since you're asking about naming, let me give you a piece of advice which will lead to one possible answer to your question:
If you're not sure, just get into the habit of always surrounding object names in [SquareBraces], then it doesn't matter too much what you call your objects.
But having said that, it's definitely best to follow a good convention, either the convention common in your workplace, or find another sensible convention to follow (sometimes it seems there are more opinions about naming conventions than there are people coding them!).
My preferences (FWIW) put simply are:
Table names that reflect the entity that a single row contains, eg, [Order], [OrderLine], [Customer], etc. To answer your original question, you'll notice that ORDER is a keyword here, but as soon as you put square braces around it the parser knows it's an object. Works every time, and you don't have to make contrived or redundant names for the table that holds Order records, for example CustomerOrder.
Composite table names for many-many linking tables reflect both tables being linked where it makes sense, such as [Customer_Contact], [Product_ProductCategory], etc.
Primary keys are always called simply [Id], regardless which entity it is a primary key for, unless there's a very good technical reason not to, which is rare.
Foreign keys are named after the table and column of the PK they refer to, eg. [CustomerId].
As to why I name tables in the singular, rather than think of a table as a collection of, say "Customers", I think of it as a repository where objects of the type "Customer" are stored. This works particularly well when using some ORMs (I like Dapper in C#, but it doesn't work so well with some others where pluralism is almost forced), so you'll have to work out what works best for you and what you're most comfortable with.
Good luck!
While writing the script to create a table, a column name that matches a keyword will change in color either pink, blue, grey. Just encase the column in question with [].
Example:
CREATE TABLE MyTable([DateTime] datetime)
Not sure if there's a system table in system databases where you could query this from but dropping it here it case it may be of use to you.
https://learn.microsoft.com/en-us/sql/t-sql/language-elements/reserved-keywords-transact-sql?view=sql-server-2017

Polymorphic ORM database pattern

I remember when - a long time ago - I was messing around with the Java ActiveObjects ORM, I came across a database pattern it claimed to support.
However, it is very difficult to find the pattern's name, by search for the general idea, thus I would really appreciate it if someone could give me the name of this pattern, and some thoughts on the "cleanness" of using it.
The pattern was defined as such:
Table:
reference_type <enum>
reference <integer>
...
... where the value of the field reference_type would determine the type (and thus the table) to which was being referred. Thus:
User:
location_type <l&l, address, city, country>
location <integer>
...
... where depending on the value of the location_type field, the foreign key location would refer to either the l&l, address, city or country table.
You're having difficulty finding it because it's not a real (in the sense of widely adopted and encouraged) database design pattern.
Stay away from patterns like this. While ORM's make mapping database tables to types easier, tables are not types, and vice versa. While it's not clear what the model you've described is supposed to do, you should not have columns that serve as fake foreign keys to multiple tables (when I say "fake", I mean that you're storing a simple identifier value that corresponds to the primary key of another table, but you can't actually define the column as a foreign key).
Model your database to represent the data, model your objects to represent the process, and use your ORM and intermediate layers to do the translation; don't try to push the database into your code, and don't push your code into the database.
Edit in reponse to comment
You're mixing database and OO terminology; while I'm not familiar with the syntax you're using to define that function, I'm assuming it's an instance function on the User type called getLocation that takes no parameters and returns a Location object. Databases don't support the concepts of instance (or any type-based) functions; relational databases can have user-defined functions, but these are simple procedural functions that take parameters and return either values or result sets. They do not correspond to particular tables or field in any way, other than the fact that you can use them within the body of the function.
That being said, there are two questions to answer here: how to do what you've asked, and what might be a better solution.
For what you've asked, it sounds like you have a supertype-subtype relationship, which is a standard database design pattern. In this case, you have a single supertype table that represents the parent:
Location
---------------
LocationID (PK)
...other common attributes
(Note here that I'm using LocationID for the sake of simplicity; you should have more specific and logical attributes to define the primary key, if possible)
Then you have one or more tables that define subtypes:
Address
-----------
LocationID (PK, FK to Location)
...address-specific attributes
Country
-----------
LocationID (PK, FK to Location)
...country-specific attributes
If a specific instance of Location can only be one of the subtypes, then you should add a discriminator value to the parent table (Location) that indicates which of the subtypes it corresponds to. You can use CHECK constraints to ensure that only valid values are in this field for a given row.
In the end, though, it sounds like you might be better served with a hybrid approach. You're fundamentally representing two different types of locations, from what I can see:
Coordinate-based locations (L&L)
Municipal/Postal/Etc.-based locations (Country, City, Address), and each of these is simply a more specific version of the previous
Given this, a simple model would look like this:
Location
------------
LocationID (PK)
LocationType (non-nullable) ('C' for coordinate, 'P' for postal)
LocationCoordinate
------------------
LocationID (PK; FK to Location)
Latitude (non-nullable)
Longitude (non-nullable)
LocationPostal
------------------
LocationID (PK, FK to Location)
Country (non-nullable)
City (nullable)
Address (nullable)
Now the only problem that remains is that we have nullable columns. If you want to keep your queries simple but take (justified!) flak from people about leaving nullable columns, then you can leave it as-is. If you want to go to what most people would consider a better-designed database, you can move to 6NF for our two nullable columns. Doing this will also have the nice side-effect of giving us a little more control over how these fields are populated without having to do anything extra.
Our two nullable fields are City and Address. I am going to assume that having an Address without a City would be nonsense. In this case, we remove these two attributes from the LocationPostal table and create two more tables:
LocationPostalCity
------------------
LocationID (PK; FK to LocationPostal)
City (non-nullable)
LocationPostalCityAddress
-------------------------
LocationID (PK; FK to LocationPostalCity)
Address (non-nullable)
Seems to me that city and country would be part of the address table, and that L&L wouldn't be mutually exclusive with address (you might have both...), so, why limit yourself like that to one or the other?
Further more, this would prevent the location column from enforcing referential integrity, would it not, since it wouldn't always reference the same table?

What's more readable naming conventions for lookup tables?

We always name lookup tables - such as Countries,Cities,Regions ... etc - as below :
EntityName_LK OR LK_EntityName ( Countries_LK OR LK_Countries )
But I ask if any one have more better naming conversions for lookup tables ?
Edit:
We think to make postfix or prefix to solve like a conflict :
if we have User tables and lookup table for UserTypes (ID-Name) and we have a relation many to many between User & UserTypes that make us a table which we can name it like Types_For_User that may make confusion between UserTypes & Types_For_User So we like to make lookup table UserTypes to be like UserTypesLK to be obvious to all
Before you decide you need the "lookup" moniker, you should try to understand why you are designating some tables as "lookups" and not others. Each table should represent an entity unto itself.
What happens when a table that was designated as a "lookup" grows in scope and is no longer considered a "lookup"? You are either left with changing the table name which can be onerous or leaving it as is and having to explain to everyone that a given table isn't really a "lookup".
A common scenario mentioned in the comments related to a junction table. For example, suppose a User can have multiple "Types" which are expressed in a junction table with two foreign keys. Should that table be called User_UserTypes? To this scenario, I would first say that I prefer to use the suffix Member on the junction table. So we would have Users, UserTypes, UserTypeMembers. Secondly, the word "type" in this context is quite generic. Does a UserType really mean a Role? The term you use can make all the difference. If UserTypes are really Roles, then our table names become Users, Roles, RoleMembers which seems quite clear.
Here are two concerns for whether to use a prefix or suffix.
In a sorted list of tables, do you want the LK tables to be together or do you want all tables pertaining to EntityName to appear together
When programming in environments with auto-complete, are you likely to want to type "LK" to get the list of tables or the beginning of EntityName?
I think there are arguments for either, but I would choose to start with EntityName.
Every table can become a lookup table.
Consider that a person is a lookup in an Invoice table.
So in my opinion, tables should just be named the (singular) entity name, e.g. Person, Invoice.
What you do want is a standard for the column names and constraints, such as
FK_Invoice_Person (in table invoice, link to person)
PersonID or Person_ID (column in table invoice, linking to entity Person)
At the end of the day, it is all up to personal preference (if you can get away with dictating it) or team standards.
updated
If you have lookups that pertain only to entities, like Invoice_Terms which is a lookup from a list of 4 scenarios, then you could name it as Invoice_LK_Terms which would make it appear by name grouped under Invoice. Another way is to have a single lookup table for simple single-value lookups, separated by the function (table+column) it is for, e.g.
Lookups
Table | Column | Value
There is only one type of table and I don't believe there is any good reason for calling some tables "lookup" tables. Use a naming convention that works equally for every table.
One area where table naming conventions can help is data migration between environments. We often have to move data in lookup tables (which constrain values which may appear in other tables) along with schema changes, as these allowed value lists change. Currently we don't name lookup tables differently, but we are considering it to prevent the migration guy asking "which tables are lookup tables again?" every time.

Modelling inheritance in a database

For a database assignment I have to model a system for a school. Part of the requirements is to model information for staff, students and parents.
In the UML class diagram I have modelled this as those three classes being subtypes of a person type. This is because they will all require information on, among other things, address data.
My question is: how do I model this in the database (mysql)?
Thoughts so far are as follows:
Create a monolithic person table that contains all the information for each type and will have lots of null values depending on what type is being stored. (I doubt this would go down well with the lecturer unless I argued the case very convincingly).
A person table with three foreign keys which reference the subtypes but two of which will be null - in fact I'm not even sure if that makes sense or is possible?
According to this wikipage about django it's possible to implement the primary key on the subtypes as follows:
"id" integer NOT NULL PRIMARY KEY REFERENCES "supertype" ("id")
Something else I've not thought of...
So for those who have modelled inheritance in a database before; how did you do it? What method do you recommend and why?
Links to articles/blog posts or previous questions are more than welcome.
Thanks for your time!
UPDATE
Alright thanks for the answers everyone. I already had a separate address table so that's not an issue.
Cheers,
Adam
4 tables staff, students, parents and person for the generic stuff.
Staff, students and parents have forign keys that each refer back to Person (not the other way around).
Person has field that identifies what the subclass of this person is (i.e. staff, student or parent).
EDIT:
As pointed out by HLGM, addresses should exist in a seperate table, as any person may have multiple addresses. (However - I'm about to disagree with myself - you may wish to deliberately constrain addresses to one per person, limiting the choices for mailing lists etc).
Well I think all approaches are valid and any lecturer who marks down for shoving it in one table (unless the requirements are specific to say you shouldn't) is removing a viable strategy due to their own personal opinion.
I highly recommend that you check out the documentation on NHibernate as this provides different approaches for performing the above. Which I will now attempt to poorly parrot.
Your options:
1) One table with all the data that has a "delimiter" column. This column states what kind of person the person is. This is viable in simple scenarios and (seriously) high performance where the joins will hurt too much
2) Table per class which will lead to duplication of columns but will avoid joins again, so its simple and a lil faster (although only a lil and indexing mitigates this in most scenarios).
3) "Proper" inheritence. The normalised version. You are almost there but your key is in the wrong place IMO. Your Employee table should contain a PersonId so you can then do:
select employee.id, person.name from employee inner join person on employee.personId = person.personId
To get all the names of employees where name is only specified on the person table.
I would go for #3.
Your goal is to impress a lecturer, not a PM or customer. Academics tend to dislike nulls and might (subconciously) penalise you for using the other methods (which rely on nulls.)
And you don't necessarily need that django extension (PRIMARY KEY ... REFERENCES ...) You could use an ordinary FOREIGN KEY for that.
"So for those who have modelled inheritance in a database before; how did you do it? What method do you recommend and why?
"
Methods 1 and 3 are good. The differences are mostly in what your use cases are.
1) adaptability -- which is easier to change? Several separate tables with FK relations to the parent table.
2) performance -- which requires fewer joins? One single table.
Rats. No design accomplishes both.
Also, there's a third design in addition to your mono-table and FK-to-parent.
Three separate tables with some common columns (usually copy-and-paste of the superclass columns among all subclass tables). This is very flexible and easy to work with. But, it requires a union of the three tables to assemble an overall list.
OO databases go through the same stuff and come up with pretty much the same options.
If the point is to model subclasses in a database, you probably are already thinking along the lines of the solutions I've seen in real OO databases (leaving fields empty).
If not, you might think about creating a system that doesn't use inheritance in this way.
Inheritance should always be used quite sparingly, and this is probably a pretty bad case for it.
A good guideline is to never use inheritance unless you actually have code that does different things to the field of a "Parent" class than to the same field in a "Child" class. If business code in your class doesn't specifically refer to a field, that field absolutely shouldn't cause inheritance.
But again, if you are in school, that may not match what they are trying to teach...
The "correct" answer for the purposes of an assignment is probably #3 :
Person
PersonId Name Address1 Address2 City Country
Student
PersonId StudentId GPA Year ..
Staff
PersonId StaffId Salary ..
Parent
PersonId ParentId ParentType EmergencyContactNumber ..
Where PersonId is always the primary key, and also a foreign key in the last three tables.
I like this approach because it makes it easy to represent the same person having more than one role. A teacher could very well also be a parent, for example.
I suggest five tables
Person
Student
Staff
Parent
Address
WHy - because people can have multiple addesses and people can also have multiple roles and the information you want for staff is different than the information you need to store for parent or student.
Further you may want to store name as last_name, Middle_name, first_name, Name_suffix (like jr.) instead of as just name. Belive me you willwant to be able to search on last_name! Name is not unique, so you will need to make sure you have a unique surrogate primary key.
Please read up about normalization before trying to design a database. Here is a source to start with:
http://www.deeptraining.com/litwin/dbdesign/FundamentalsOfRelationalDatabaseDesign.aspx
Super type Person should be created like this:
CREATE TABLE Person(PersonID int primary key, Name varchar ... etc ...)
All Sub types should be created like this:
CREATE TABLE IF NOT EXISTS Staffs(StaffId INT NOT NULL ,
PRIMARY KEY (StaffId) ,
CONSTRAINT FK_StaffId FOREIGN KEY (StaffId) REFERENCES Person(PersonId)
)
CREATE TABLE IF NOT EXISTS Students(StudentId INT NOT NULL ,
PRIMARY KEY (StudentId) ,
CONSTRAINT FK_StudentId FOREIGN KEY (StudentId) REFERENCES Person(PersonId)
)
CREATE TABLE IF NOT EXISTS Parents(PersonID INT NOT NULL ,
PRIMARY KEY (PersonID ) ,
CONSTRAINT FK_PersonID FOREIGN KEY (PersonID ) REFERENCES Person(PersonId)
)
Foreign key in subtypes staffs,students,parents adds two conditions:
Person row cannot be deleted unless corresponding subtype row will
not be deleted. For e.g. if there is one student entry in students
table referring to Person table, without deleting student entry
person entry cannot be deleted, which is very important. If Student
object is created then without deleting Student object we cannot
delete base Person object.
All base types have foreign key "not null" to make sure each base
type will have base type existing always. For e.g. If you create
Student object you must create Person object first.

Database, Table and Column Naming Conventions? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
Whenever I design a database, I always wonder if there is a best way of naming an item in my database. Quite often I ask myself the following questions:
Should table names be plural?
Should column names be singular?
Should I prefix tables or columns?
Should I use any case in naming items?
Are there any recommended guidelines out there for naming items in a database?
I recommend checking out Microsoft's SQL Server sample databases:
https://github.com/Microsoft/sql-server-samples/releases/tag/adventureworks
The AdventureWorks sample uses a very clear and consistent naming convention that uses schema names for the organization of database objects.
Singular names for tables
Singular names for columns
Schema name for tables prefix (E.g.: SchemeName.TableName)
Pascal casing (a.k.a. upper camel case)
Late answer here, but in short:
Plural table names: My preference is plural
Singular column names: Yes
Prefix tables or columns:
Tables: *Usually* no prefixes is best.
Columns: No.
Use any case in naming items: PascalCase for both tables and columns.
Elaboration:
(1) What you must do. There are very few things that you must do a certain way, every time, but there are a few.
Name your primary keys using "[singularOfTableName]ID" format. That is, whether your table name is Customer or Customers, the primary key should be CustomerID.
Further, foreign keys must be named consistently in different tables. It should be legal to beat up someone who does not do this. I would submit that while defined foreign key constraints are often important, consistent foreign key naming is always important
You database must have internal conventions. Even though in later sections you'll see me being very flexible, within a database naming must be very consistent . Whether your table for customers is called Customers or Customer is less important than that you do it the same way throughout the same database. And you can flip a coin to determine how to use underscores, but then you must keep using them the same way. If you don't do this, you are a bad person who should have low self-esteem.
(2) What you should probably do.
Fields representing the same kind of data on different tables should be named the same. Don't have Zip on one table and ZipCode on another.
To separate words in your table or column names, use PascalCasing. Using camelCasing would not be intrinsically problematic, but that's not the convention and it would look funny. I'll address underscores in a moment. (You may not use ALLCAPS as in the olden days. OBNOXIOUSTABLE.ANNOYING_COLUMN was okay in DB2 20 years ago, but not now.)
Don't artifically shorten or abbreviate words. It is better for a name to be long and clear than short and confusing. Ultra-short names is a holdover from darker, more savage times. Cus_AddRef. What on earth is that? Custodial Addressee Reference? Customer Additional Refund? Custom Address Referral?
(3) What you should consider.
I really think you should have plural names for tables; some think singular. Read the arguments elsewhere. Column names should be singular however. Even if you use plural table names, tables that represent combinations of other tables might be in the singular. For example, if you have a Promotions and an Items table, a table representing an item being a part of a promotion could be Promotions_Items, but it could also legitimately be Promotion_Items I think (reflecting the one-to-many relationship).
Use underscores consistently and for a particular purpose. Just general tables names should be clear enough with PascalCasing; you don't need underscores to separate words. Save underscores either (a) to indicate an associative table or (b) for prefixing, which I'll address in the next bullet.
Prefixing is neither good or bad. It usually is not best. In your first db or two, I would not suggest using prefixes for general thematic grouping of tables. Tables end up not fitting your categories easily, and it can actually make it harder to find tables. With experience, you can plan and apply a prefixing scheme that does more good than harm. I worked in a db once where data tables began with tbl, config tables with ctbl, views with vew, proc's sp, and udf's fn, and a few others; it was meticulously, consistently applied so it worked out okay. The only time you NEED prefixes is when you have really separate solutions that for some reason reside in the same db; prefixing them can be very helpful in grouping the tables. Prefixing is also okay for special situations, like for temporary tables that you want to stand out.
Very seldom (if ever) would you want
to prefix columns.
Ok, since we're weighing in with opinion:
I believe that table names should be plural. Tables are a collection (a table) of entities. Each row represents a single entity, and the table represents the collection. So I would call a table of Person entities People (or Persons, whatever takes your fancy).
For those who like to see singular "entity names" in queries, that's what I would use table aliases for:
SELECT person.Name
FROM People person
A bit like LINQ's "from person in people select person.Name".
As for 2, 3 and 4, I agree with #Lars.
I work in a database support team with three DBAs and our considered options are:
Any naming standard is better than no standard.
There is no "one true" standard, we all have our preferences
If there is standard already in place, use it. Don't create another standard or muddy the existing standards.
We use singular names for tables. Tables tend to be prefixed with the name of the system (or its acronym). This is useful if the system complex as you can change the prefix to group the tables together logically (ie. reg_customer, reg_booking and regadmin_limits).
For fields we'd expect field names to be include the prefix/acryonm of the table (i.e. cust_address1) and we also prefer the use of a standard set of suffixes ( _id for the PK, _cd for "code", _nm for "name", _nb for "number", _dt for "Date").
The name of the Foriegn key field should be the same as the Primary key field.
i.e.
SELECT cust_nm, cust_add1, booking_dt
FROM reg_customer
INNER JOIN reg_booking
ON reg_customer.cust_id = reg_booking.cust_id
When developing a new project, I'd recommend you write out all the preferred entity names, prefixes and acronyms and give this document to your developers. Then, when they decide to create a new table, they can refer to the document rather than "guess" what the table and fields should be called.
No. A table should be named after the entity it represents.
Person, not persons is how you would refer to whoever one of the records represents.
Again, same thing. The column FirstName really should not be called FirstNames. It all depends on what you want to represent with the column.
NO.
Yes. Case it for clarity. If you need to have columns like "FirstName", casing will make it easier to read.
Ok. Thats my $0.02
I hear the argument all the time that whether or not a table is pluralized is all a matter of personal taste and there is no best practice. I don't believe that is true, especially as a programmer as opposed to a DBA. As far as I am aware, there are no legitimate reasons to pluralize a table name other than "It just makes sense to me because it's a collection of objects," while there are legitimate gains in code by having singular table names. For example:
It avoids bugs and mistakes caused by plural ambiguities. Programmers aren't exactly known for their spelling expertise, and pluralizing some words are confusing. For example, does the plural word end in 'es' or just 's'? Is it persons or people? When you work on a project with large teams, this can become an issue. For example, an instance where a team member uses the incorrect method to pluralize a table he creates. By the time I interact with this table, it is used all over in code I don't have access to or would take too long to fix. The result is I have to remember to spell the table wrong every time I use it. Something very similar to this happened to me. The easier you can make it for every member of the team to consistently and easily use the exact, correct table names without errors or having to look up table names all the time, the better. The singular version is much easier to handle in a team environment.
If you use the singular version of a table name AND prefix the primary key with the table name, you now have the advantage of easily determining a table name from a primary key or vice versa via code alone. You can be given a variable with a table name in it, concatenate "Id" to the end, and you now have the primary key of the table via code, without having to do an additional query. Or you can cut off "Id" from the end of a primary key to determine a table name via code. If you use "id" without a table name for the primary key, then you cannot via code determine the table name from the primary key. In addition, most people who pluralize table names and prefix PK columns with the table name use the singular version of the table name in the PK (for example statuses and status_id), making it impossible to do this at all.
If you make table names singular, you can have them match the class names they represent. Once again, this can simplify code and allow you to do really neat things, like instantiating a class by having nothing but the table name. It also just makes your code more consistent, which leads to...
If you make the table name singular, it makes your naming scheme consistent, organized, and easy to maintain in every location. You know that in every instance in your code, whether it's in a column name, as a class name, or as the table name, it's the same exact name. This allows you to do global searches to see everywhere that data is used. When you pluralize a table name, there will be cases where you will use the singular version of that table name (the class it turns into, in the primary key). It just makes sense to not have some instances where your data is referred to as plural and some instances singular.
To sum it up, if you pluralize your table names you are losing all sorts of advantages in making your code smarter and easier to handle. There may even be cases where you have to have lookup tables/arrays to convert your table names to object or local code names you could have avoided. Singular table names, though perhaps feeling a little weird at first, offer significant advantages over pluralized names and I believe are best practice.
I'm also in favour of a ISO/IEC 11179 style naming convention, noting they are guidelines rather than being prescriptive.
See Data element name on Wikipedia:
"Tables are Collections of Entities, and follow Collection naming guidelines. Ideally, a collective name is used: eg., Personnel. Plural is also correct: Employees. Incorrect names include: Employee, tblEmployee, and EmployeeTable."
As always, there are exceptions to rules e.g. a table which always has exactly one row may be better with a singular name e.g. a config table. And consistency is of utmost importance: check whether you shop has a convention and, if so, follow it; if you don't like it then do a business case to have it changed rather than being the lone ranger.
our preference:
Should table names be plural?
Never. The arguments for it being a collection make sense, but you never know what the table is going to contain (0,1 or many items). Plural rules make the naming unnecessarily complicated. 1 House, 2 houses, mouse vs mice, person vs people, and we haven't even looked at any other languages.
Update person set property = 'value' acts on each person in the table.
Select * from person where person.name = 'Greg' returns a collection/rowset of person rows.
Should column names be singular?
Usually, yes, except where you are breaking normalisation rules.
Should I prefix tables or columns?
Mostly a platform preference. We prefer to prefix columns with the table name. We don't prefix tables, but we do prefix views (v_) and stored_procedures (sp_ or f_ (function)). That helps people who want to try to upday v_person.age which is actually a calculated field in a view (which can't be UPDATEd anyway).
It is also a great way to avoid keyword collision (delivery.from breaks, but delivery_from does not).
It does make the code more verbose, but often aids in readability.
bob = new person()
bob.person_name = 'Bob'
bob.person_dob = '1958-12-21'
... is very readable and explicit. This can get out of hand though:
customer.customer_customer_type_id
indicates a relationship between customer and the customer_type table, indicates the primary key on the customer_type table (customer_type_id) and if you ever see 'customer_customer_type_id' whilst debugging a query, you know instantly where it is from (customer table).
or where you have a M-M relationship between customer_type and customer_category (only certain types are available to certain categories)
customer_category_customer_type_id
... is a little (!) on the long side.
Should I use any case in naming items?
Yes - lower case :), with underscores. These are very readable and cross platform. Together with 3 above it also makes sense.
Most of these are preferences though. - As long as you are consistent, it should be predictable for anyone that has to read it.
Take a look at ISO 11179-5: Naming and identification principles
You can get it here: http://metadata-standards.org/11179/#11179-5
I blogged about it a while back here: ISO-11179 Naming Conventions
I know this is late to the game, and the question has been answered very well already, but I want to offer my opinion on #3 regarding the prefixing of column names.
All columns should be named with a prefix that is unique to the table they are defined in.
E.g. Given tables "customer" and "address", let's go with prefixes of "cust" and "addr", respectively. "customer" would have "cust_id", "cust_name", etc. in it. "address" would have "addr_id", "addr_cust_id" (FK back to customer), "addr_street", etc. in it.
When I was first presented with this standard, I was dead-set against it; I hated the idea. I couldn't stand the idea of all that extra typing and redundancy. Now I've had enough experience with it that I'd never go back.
The result of doing this is that all of the columns in your database schema are unique. There is one major benefit to this, which trumps all arguments against it (in my opinion, of course):
You can search your entire code base and reliably find every line of code that touches a particular column.
The benefit from #1 is incredibly huge. I can deprecate a column and know exactly what files need to be updated before the column can safely be removed from the schema. I can change the meaning of a column and know exactly what code needs to be refactored. Or I can simply tell if data from a column is even being used in a particular portion of the system. I can't count the number of times this has turned a potentially huge project into a simple one, nor the amount of hours we've saved in development work.
Another, relatively minor benefit to it is that you only have to use table-aliases when you do a self join:
SELECT cust_id, cust_name, addr_street, addr_city, addr_state
FROM customer
INNER JOIN address ON addr_cust_id = cust_id
WHERE cust_name LIKE 'J%';
My opinions on these are:
1) No, table names should be singular.
While it appears to make sense for the simple selection (select * from Orders) it makes less sense for the OO equivalent (Orders x = new Orders).
A table in a DB is really the set of that entity, it makes more sense once you're using set-logic:
select Orders.*
from Orders inner join Products
on Orders.Key = Products.Key
That last line, the actual logic of the join, looks confusing with plural table names.
I'm not sure about always using an alias (as Matt suggests) clears that up.
2) They should be singular as they only hold 1 property
3) Never, if the column name is ambiguous (as above where they both have a column called [Key]) the name of the table (or its alias) can distinguish them well enough. You want queries to be quick to type and simple - prefixes add unnecessary complexity.
4) Whatever you want, I'd suggest CapitalCase
I don't think there's one set of absolute guidelines on any of these.
As long as whatever you pick is consistent across the application or DB I don't think it really matters.
In my opinion:
Table names should be plural.
Column names should be singular.
No.
Either CamelCase (my preferred) or underscore_separated for both table names and column names.
However, like it has been mentioned, any convention is better than no convention. No matter how you choose to do it, document it so that future modifications follow the same conventions.
I think the best answer to each of those questions would be given by you and your team. It's far more important to have a naming convention then how exactly the naming convention is.
As there's no right answer to that, you should take some time (but not too much) and choose your own conventions and - here's the important part - stick to it.
Of course it's good to seek some information about standards on that, which is what you're asking, but don't get anxious or worried about the number of different answers you might get: choose the one that seems better for you.
Just in case, here are my answers:
Yes. A table is a group of records, teachers or actors, so... plural.
Yes.
I don't use them.
The database I use more often - Firebird - keeps everything in upper case, so it doesn't matter. Anyway, when I'm programming I write the names in a way that it's easier to read, like releaseYear.
Definitely keep table names singular, person not people
Same here
No. I've seen some terrible prefixes, going so far as to state what were dealing with is a table (tbl_) or a user store procedure (usp_). This followed by the database name... Don't do it!
Yes. I tend to PascalCase all my table names
Naming conventions allow the development team to design discovereability and maintainability at the heart of the project.
A good naming convention takes time to evolve but once it’s in place it allows the team to move forward with a common language. A good naming convention grows organically with the project. A good naming convention easily copes with changes during the longest and most important phase of the software lifecycle - service management in production.
Here are my answers:
Yes, table names should be plural when they refer to a set of trades, securities, or counterparties for example.
Yes.
Yes. SQL tables are prefixed with tb_, views are prefixed vw_, stored procedures are prefixed usp_ and triggers are prefixed tg_ followed by the database name.
Column name should be lower case separated by underscore.
Naming is hard but in every organisation there is someone who can name things and in every software team there should be someone who takes responsibility for namings standards and ensures that naming issues like sec_id, sec_value and security_id get resolved early before they get baked into the project.
So what are the basic tenets of a good naming convention and standards: -
Use the language of your client and
your solution domain
Be descriptive
Be consistent
Disambiguate, reflect and refactor
Don’t use abbreviations unless they
are clear to everyone
Don’t use SQL reserved keywords as
column names
Here's a link that offers a few choices. I was searching for a simple spec I could follow rather than having to rely on a partially defined one.
http://justinsomnia.org/writings/naming_conventions.html
SELECT
UserID, FirstName, MiddleInitial, LastName
FROM Users
ORDER BY LastName
Table names should always be singular, because they represent a set of objects. As you say herd to designate a group of sheep, or flock do designate a group of birds. No need for plural. When a table name is composition of two names and naming convention is in plural it becomes hard to know if the plural name should be the first word or second word or both.
It’s the logic – Object.instance, not objects.instance. Or TableName.column, not TableNames.column(s).
Microsoft SQL is not case sensitive, it’s easier to read table names, if upper case letters are used, to separate table or column names when they are composed of two or more names.
Table Name: It should be singular, as it is a singular entity representing a real world object and not objects, which is singlular.
Column Name: It should be singular only then it conveys that it will hold an atomic value and will confirm to the normalization theory. If however, there are n number of same type of properties, then they should be suffixed with 1, 2, ..., n, etc.
Prefixing Tables / Columns: It is a huge topic, will discuss later.
Casing: It should be Camel case
My friend, Patrick Karcher, I request you to please not write anything which may be offensive to somebody, as you wrote, "•Further, foreign keys must be named consistently in different tables. It should be legal to beat up someone who does not do this.". I have never done this mistake my friend Patrick, but I am writing generally. What if they together plan to beat you for this? :)
Very late to the party but I still wanted to add my two cents about column prefixes
There seem to be two main arguments for using the table_column (or tableColumn) naming standard for columns, both based on the fact that the column name itself will be unique across your whole database:
1) You do not have to specify table names and/or column aliases in your queries all the time
2) You can easily search your whole code for the column name
I think both arguments are flawed. The solution for both problems without using prefixes is easy. Here's my proposal:
Always use the table name in your SQL. E.g., always use table.column instead of column.
It obviously solves 2) as you can now just search for table.column instead of table_column.
But I can hear you scream, how does it solve 1)? It was exactly about avoiding this. Yes, it was, but the solution was horribly flawed. Why? Well, the prefix solution boils down to:
To avoid having to specify table.column when there's ambiguity, you name all your columns table_column!
But this means you will from now on ALWAYS have to write the column name every time you specify a column. But if you have to do that anyways, what's the benefit over always explicitly writing table.column? Exactly, there is no benefit, it's the exact same number of characters to type.
edit: yes, I am aware that naming the columns with the prefix enforces the correct usage whereas my approach relies on the programmers
Essential Database Naming Conventions (and Style) (click here for more detailed description)
table names
choose short, unambiguous names, using no more than one or two words
distinguish tables easily
facilitates the naming of unique field names as well as lookup and linking tables
give tables singular names, never plural (update: i still agree with the reasons given for this convention, but most people really like plural table names, so i’ve softened my stance)... follow the link above please
Table names singular. Let's say you were modelling a realtionship between someone and their address.
For example, if you are reading a datamodel would you prefer
'each person may live at 0,1 or many address.' or
'each people may live at 0,1 or many addresses.'
I think its easier to pluralise address, rather than have to rephrase people as person. Plus collective nouns are quite often dissimlar to the singular version.
--Example SQL
CREATE TABLE D001_Students
(
StudentID INTEGER CONSTRAINT nnD001_STID NOT NULL,
ChristianName NVARCHAR(255) CONSTRAINT nnD001_CHNA NOT NULL,
Surname NVARCHAR(255) CONSTRAINT nnD001_SURN NOT NULL,
CONSTRAINT pkD001 PRIMARY KEY(StudentID)
);
CREATE INDEX idxD001_STID on D001_Students;
CREATE TABLE D002_Classes
(
ClassID INTEGER CONSTRAINT nnD002_CLID NOT NULL,
StudentID INTEGER CONSTRAINT nnD002_STID NOT NULL,
ClassName NVARCHAR(255) CONSTRAINT nnD002_CLNA NOT NULL,
CONSTRAINT pkD001 PRIMARY KEY(ClassID, StudentID),
CONSTRAINT fkD001_STID FOREIGN KEY(StudentID)
REFERENCES D001_Students(StudentID)
);
CREATE INDEX idxD002_CLID on D002_Classes;
CREATE VIEW V001_StudentClasses
(
SELECT
D001.ChristianName,
D001.Surname,
D002.ClassName
FROM
D001_Students D001
INNER JOIN
D002_Classes D002
ON
D001.StudentID = D002.StudentID
);
These are the conventions I was taught, but you should adapt to whatever you developement hose uses.
Plural. It is a collection of entities.
Yes. The attribute is a representation of singular property of an entity.
Yes, prefix table name allows easily trackable naming of all constraints indexes and table aliases.
Pascal Case for table and column names, prefix + ALL caps for indexes and constraints.

Resources