Where can I find a detailed manual about PostgreSQL naming conventions? (table names vs. camel case, sequences, primary keys, constraints, indexes, etc...)
Regarding table names, case, etc., the prevalent convention is:
SQL keywords: UPPER CASE
identifiers (names of databases, tables, columns, etc): lower_case_with_underscores
For example:
UPDATE my_table SET name = 5;
This is not written in stone, but the bit about identifiers in lower case is highly recommended, IMO. PostgreSQL treats identifiers case-insensitively when not quoted (it actually folds them to lowercase internally), and case-sensitively when quoted; many people are not aware of this idiosyncrasy. If you always use lowercase, you are safe. Anyway, it's acceptable to use camelCase or PascalCase (or UPPER_CASE), as long as you are consistent: either quote identifiers always or never (and this includes the schema creation!).
I am not aware of many more conventions or style guides. Surrogate keys are normally made from a sequence (usually with the serial macro), so it would be convenient to stick to that naming for those sequences if you create them by hand (tablename_colname_seq).
See also some discussion here, here and (for general SQL) here, all with several related links.
Note: PostgreSQL 10 introduced identity columns as an SQL-compliant replacement for serial.
There isn't really a formal manual, because there's no single style or standard.
So long as you understand the rules of identifier naming you can use whatever you like.
In practice, I find it easier to use lower_case_underscore_separated_identifiers because it isn't necessary to "Double Quote" them everywhere to preserve case, spaces, etc.
If you wanted to name your tables and functions "#MyAṕṕ! ""betty"" Shard$42" you'd be free to do that, though it'd be a pain to type everywhere.
The main things to understand are:
Unless double-quoted, identifiers are case-folded to lower-case, so MyTable, MYTABLE and mytable are all the same thing, but "MYTABLE" and "MyTable" are different;
Unless double-quoted:
SQL identifiers and key words must begin with a letter (a-z, but also letters with diacritical marks and non-Latin letters) or an underscore (_). Subsequent characters in an identifier or key word can be letters, underscores, digits (0-9), or dollar signs ($).
You must double-quote keywords if you wish to use them as identifiers.
In practice I strongly recommend that you do not use keywords as identifiers. At least avoid reserved words. Just because you can name a table "with" doesn't mean you should.
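To make the folding rule concrete, here is a minimal sketch in Python. The helper fold_identifier is hypothetical (it is not part of PostgreSQL or any driver) and only imitates the rule described above: unquoted names fold to lower case, double-quoted names keep their case.

```python
def fold_identifier(token):
    """Imitate PostgreSQL-style folding for a single identifier token.

    Hypothetical helper for illustration only: unquoted tokens are folded
    to lower case; double-quoted tokens keep their case, and a doubled
    quote ("") inside them stands for one literal quote character.
    """
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        return token[1:-1].replace('""', '"')
    return token.lower()


names = ["MyTable", "MYTABLE", "mytable", '"MyTable"', '"MYTABLE"']
resolved = {name: fold_identifier(name) for name in names}
# The three unquoted spellings resolve to one name; the quoted ones stay distinct.
distinct = sorted(set(resolved.values()))
```

With these inputs, distinct holds three names: MYTABLE, MyTable and mytable.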
The only two answers here are 6 years old, and I don't know if snake_case being the best case is still true. Here's my take for modern times, setting aside the extra complication of needing to double-quote. I think flow is more important than avoiding a minor inconvenience.
Given that there are no strict guidelines or style guides, I'd say it is best to use the same case as the project code. For example, with an OOP approach in a language like JavaScript, table names would be in PascalCase whereas attributes would be in camelCase. If you're taking the functional approach, both would be camelCase. By convention JS classes are PascalCase and attributes are camelCase anyway, so this makes sense.
On the other hand, if you are coding in Python using SQLAlchemy, it makes sense to use snake_case names for function-derived models and PascalCase names for class-derived models. In both cases, attributes/columns should be snake_case.
I have been looking for more than two hours, and I have found many articles about table/column naming and other tips, but no exact answer regarding naming the database itself. Can you please tell me the best option? And are there real cases where it makes sense?
clothing_store
ClothingStore
clothingStore
or maybe clothing-store
The MySQL root user sees default databases named in the first style (information_schema, performance_schema, sys). Does that mean the first is best?
Generally, go with underscores, since they are easy to read in both upper and lower case. It's also the most commonly used convention in most reference material and existing databases:
clothing_store
(And CLOTHING_STORE.)
Most, if not all, DB engines treat table names case-insensitively (even if some display them in their original case-sensitive name). So these two are the same:
ClothingStore
clothingStore
And so are "Clothingstore", "clothingstore", "CLotHinGstOre", etc.
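A quick way to see this for one engine is the sketch below, which uses Python's built-in sqlite3 module. SQLite is only a stand-in chosen because it needs no server; other engines may behave differently (MySQL on Linux, for instance, depends on its configuration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ClothingStore (id INTEGER PRIMARY KEY, name TEXT)")

# Different spellings of the same name reach the same table.
conn.execute("INSERT INTO clothingstore (name) VALUES ('hats')")
rows = conn.execute("SELECT name FROM CLOTHINGSTORE").fetchall()

# A second table whose name differs only in case is rejected as a duplicate.
try:
    conn.execute("CREATE TABLE clothingStore (id INTEGER)")
    created_second = True
except sqlite3.OperationalError:
    created_second = False
```

Here rows contains the single inserted row and created_second ends up False.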
Unquoted table names can't contain a hyphen, since it is parsed as the subtraction operator, as in a - b:
clothing-store
You could just call it "store", unless you've got multiple tables of different stores.
When an ontology is created from text consisting of a set of sentences, it can be useful to bind any given concept with all the sentences, where it is present. But that inevitably leads to a nasty duplication of sentences, when the usual Annotation is used for storing the related text.
E.g. the sentence "Attributive language is the base language which allows: Atomic negation (negation of concept names that do not appear on the left hand side of axioms), Concept intersection, Universal restrictions, Limited existential quantification." would need to be copied as an Annotation to the Entities: Attributive language, Language, Atomic negation, Negation, Concept names, Axiom, Concept intersection, Universal restriction, Limited existential quantification.
What is in your opinion a good way to avoid copying the same sentence to several locations and yet to have traces from the Entity to the relevant sentences?
I would create a named individual with an IRI and attach the sentence to it, then add a relationship from the concepts to the individual.
The individual might have a type, e.g., Sentence, but this is not necessary. Properties used can be annotation properties or data/object properties.
It's clear that you can use numeric characters in SQL table names and use them so long as they're not at the beginning. (There's a discussion here on one of the side effects: SQLite issue with Table Names using numbers?)
The database I'm targeting is Oracle 10g/11g.
I'm designing a reporting database where naming some of the entities clearly is best done by describing the reports, which are named after numbers ('part 45', '102S', '401'). It's just the business domain language: these reports just aren't commonly referred to by any other name. The entities I'm modelling really are best named this way.
My question is: am I going to have difficulties with maintenance or programmability if I put numbers in a table name? I'm always worried about ancillary software around the database: drivers, ETL code that might not play nice with a non-plain-vanilla name. But there's a real benefit in intelligibility in this business domain, so am I just being squeamish?
My question put simply is: are there any 'gotchas' or corner cases that would rule out a table name like PART_45_AUDIT?
If PART_45_AUDIT is really the clearest description of the entity you're modeling (which would be very rare), there shouldn't be any gotchas to having numbers in the middle of a name. Putting numbers at the front of the name would be a different story because that would require using double-quoted identifiers and there are plenty of tools that don't fully support double-quoted identifiers. Plus, of course, it's rather annoying to have to type the double-quotes every time you reference the table.
CREATE TABLE "102S" (
  col1 number
);
SELECT *
  FROM "102S";
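The same distinction can be tried out locally. The sketch below uses Python's built-in sqlite3 module as a stand-in, not Oracle, so the error text and tool behaviour will differ on Oracle 10g/11g; it only illustrates that digits in the middle of a name need no quoting while a leading digit does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Digits in the middle of an unquoted name are accepted.
conn.execute("CREATE TABLE PART_45_AUDIT (col1 NUMERIC)")

# A leading digit is rejected unless the name is double-quoted.
try:
    conn.execute("CREATE TABLE 102S (col1 NUMERIC)")
    unquoted_ok = True
except sqlite3.OperationalError:
    unquoted_ok = False

conn.execute('CREATE TABLE "102S" (col1 NUMERIC)')

tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

After running this, unquoted_ok is False and tables lists both 102S and PART_45_AUDIT.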
I've been reading a couple of questions/answers on StackOverflow trying to find the 'best', or should I say most accepted, way to name tables in a database.
Most developers tend to name the tables depending on the language that uses the database (Java, .NET, PHP, etc). However, I just feel this isn't right.
The way I've been naming tables till now is doing something like:
doctorsMain
doctorsProfiles
doctorsPatients
patientsMain
patientsProfiles
patientsAntecedents
The things I'm concerned about are:
Legibility
Quick identification of the module the table is from (doctors||patients)
Easy to understand, to prevent confusion.
I would like to read any opinions regarding naming conventions.
Thank you.
Being consistent is far more important than what particular scheme you use.
I typically use PascalCase and the entities are singular:
DoctorMain
DoctorProfile
DoctorPatient
It mimics the naming conventions for classes in my application keeping everything pretty neat, clean, consistent, and easy to understand for everybody.
Since the question is not specific to a particular platform or DB engine, I must say for maximum portability, you should always use lowercase table names.
/[a-z_][a-z0-9_]*/ is really the only pattern of names that seamlessly translates between different platforms. Lowercase alpha-numeric+underscore will always work consistently.
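If you want to enforce that pattern, for example in a schema review script, a check could look like this sketch in Python. The function name is_portable_name is made up for illustration; the regular expression is the one given above, applied to the whole name.

```python
import re

# The pattern from the answer above, matched against the entire name.
PORTABLE_NAME = re.compile(r"[a-z_][a-z0-9_]*")


def is_portable_name(name):
    """Return True if `name` is lowercase alphanumeric plus underscore
    and does not start with a digit."""
    return PORTABLE_NAME.fullmatch(name) is not None


candidates = ["clothing_store", "ClothingStore", "clothing-store", "102s", "part_45_audit"]
portable = [name for name in candidates if is_portable_name(name)]
```

Only clothing_store and part_45_audit pass; the mixed-case, hyphenated and digit-first names are rejected.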
As mentioned elsewhere, relation (table) names should be singular: http://www.teamten.com/lawrence/programming/use-singular-nouns-for-database-table-names.html
The case-insensitive nature of SQL favors an Underscores_Scheme. Modern software supports any kind of naming scheme; however, bugs, errors, or human factors can sometimes lead to UPPERCASINGEVERYTHING, in which case those who chose a scheme with underscores (Pascal_Case or Underscore_Case) keep their names readable and their nerves intact.
An aggregation of most of the above:
don't rely on case in the database
don't consider the case or separator part of the name - just the words
do use whatever separator or case is the standard for your language
Then you can easily translate (even automatically) names between environments.
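As a rough illustration of treating only the words as the name, here is a Python sketch. The helper names (split_words, to_snake, to_pascal, to_camel) are invented for this example, and the splitting rules are a simplification that assumes ASCII names.

```python
import re


def split_words(name):
    """Split snake_case, camelCase or PascalCase into lowercase words."""
    words = []
    for part in re.split(r"[_\-\s]+", name):
        # Acronym runs, capitalised or lowercase words, and digit runs.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [word.lower() for word in words]


def to_snake(name):
    return "_".join(split_words(name))


def to_pascal(name):
    return "".join(word.capitalize() for word in split_words(name))


def to_camel(name):
    words = split_words(name)
    return "".join(words[:1] + [word.capitalize() for word in words[1:]])
```

For example, to_snake("DoctorProfile") gives doctor_profile and to_pascal("doctor_profile") gives DoctorProfile, so the database and the application can each keep their own convention.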
But I'd add another consideration: you may find that there are other factors when you move from a class in your app to a table in your database: the database object has views, triggers, stored procs, indexes, constraints, etc - that also need names. So for example, you may find yourself only accessing tables via views that are typically just a simple "select * from foo". These may be identified as the table name with just a suffix of '_v' or you could put them in a different schema. The purpose for such a simple abstraction layer is that it can be expanded when necessary to allow changes in one environment to avoid impacting the other. This wouldn't break the above naming suggestions - just a few more things to account for.
I use underscores. I did an Oracle project some years ago, and it seemed that Oracle forced all my object names to upper case, which kind of blows any casing scheme. I am not really an Oracle guy, so maybe there was a way around this that I wasn't aware of, but it made me use underscores and I have never gone back.
I tend to agree with the people who say it depends on the conventions of language you're using (e.g. PascalCase for C# and snake_case for Ruby).
Never camelCase, though.
After reading a lot of other opinions, I think it's very important to use the naming conventions of the language. Consistency is more important than naming conventions only if you are (and will remain) the only developer of the application. If you want readability (which is hugely important), you had better use the naming conventions of each language. In MySQL, for example, I don't suggest using CamelCase since not all platforms are case sensitive, so underscores work better there.
These are my five cents. I came to the conclusion that if DBs from different vendors are used for one project, there are two best options:
Use underscores.
Use camel case with quotes.
The reason is that some databases convert all characters to uppercase and some to lowercase. So if you have myTable, it will become MYTABLE or mytable when you work with the DB.
Naming conventions exist within the scope of a language, and different languages have different naming conventions.
SQL is case-insensitive by default; so, snake_case is a widely used convention. SQL also supports delimited identifiers; so, mixed case is an option, like camelCase (Java, where fields == columns) or PascalCase (C#, where tables == classes and columns == fields). If your DB engine can't support the SQL standard, that's its problem. You can decide to live with that or choose another engine. (And why C# just had to be different is a point of aggravation for those of us who code in both.)
If you intend to ever only use one language in your services and applications, use the conventions of that language at all layers. Else, use the most widely used conventions of the language in the domain where that language is used.
C# approach
Singular/Plural
Use singular if the column in a row holds just one value.
If it is an array, go for plural. That also makes perfect sense when you foreach over the elements. E.g. your array column MostVisitedLocations contains: London, NewYork, Bratislava
then:
foreach (var mostVisitedLocation in MostVisitedLocations) {
    // go through each array element
}
Casing
PascalCase for table names and camelCase for columns made the best sense to me. But in my case, in .NET 5, when I had JSON objects saved in the database with property names in camelCase, System.Text.Json wasn't able to deserialise them to objects. Your model has to be public, and public properties are PascalCase, so mapping table columns (camelCase) and JSON property names (camelCase) to these properties can fail, because the mapping is case sensitive by default.
Btw, with Newtonsoft.Json this problem is not present.
So I ended up with:
Tables: App.Admin, App.Pricing, UserData.Account
Columns: Id, Price, IsOnline.
2 suggestions based on use cases:
Singular table names.
Although I used to believe in pluralizing table names, I found in practise that there is little to no benefit to it, other than matching the human tendency to think of tables as collections.
When singularising the table names, you can silently add -table to the singular table name in your head, and then it all makes sense again.
SELECT username FROM UserTable
Sounds more natural than
SELECT username FROM UsersTable
But post-fixing every table name with -table is just a waste.
The actual practical argumentation for singularising table names:
What is the plural of person: persons or people?
This is still ok.
But how do you like a table with postfix -status? Statuses?
That sucks, sorry.
It is easy to inadvertently make a human mistake by singularizing the status table, but pluralizing the other tables.
PascalCasing + Underscore convention.
Given table User, Role and a many-to-many table User_Role.
A snake_case name like user_role is ambiguous when all table names use underscores by default.
Is user_role a table that contains user roles? In this case it is not, it is a join table.
When deciding on table name conventions, I think it is useful to let go of personal preference and take into account the practical considerations of real-life problems, in order to minimize ambiguous situations.
As the many answers and opinions have indicated, whatever your personal opinion is, different people think differently, and you will not be the only person working on the database despite being the one who sets it up (unless you are, in which case you're only helping yourself).
Therefore it is useful to have practical argumentation (practical in the sense of, does it help my future co-workers to avoid dubious situations) when your past decision is being questioned.
Unfortunately there is no "best" answer to this question. As David stated, consistency is far more important than the naming convention.
There's wide variability in how to separate words, so there you'll have to pick whatever you like best; at the same time, there seems to be near consensus that the table name should be singular.
I wonder if it's a problem, if a table or column name contains upper case letters. Something lets me believe databases have less trouble when everything is kept lower case. Is that true? Which databases don't like any upper case symbol in table and column names?
I need to know, because my framework auto-generates the relational model from an ER-model.
(this question is not about whether it's good or bad style, but only about if it's a technical problem for any database)
As far as I know there is no problem using either uppercase or lowercase. One reason for the lower case convention is that queries are more readable with lowercase table and column names and upper case SQL keywords:
SELECT column_a, column_b FROM table_name WHERE column_a = 'test'
It is not a technical problem for the database to have uppercase letters in your table or column names, for any DB engine that I'm aware of. Keep in mind many DB implementations use case sensitive names, so always refer to tables and columns using the same case with which they were created (I am speaking very generally since you didn't specify a particular implementation).
For MySQL, here is some interesting information about how it handles identifier case. There are some options you can set to determine how they are stored internally. http://dev.mysql.com/doc/refman/5.0/en/identifier-case-sensitivity.html
The SQL-92 standard specifies that identifiers and keywords are case-insensitive (per A Guide to the SQL Standard 4th edition, Date / Darwen)
That's not to say that a particular DBMS isn't either (1) broken, or (2) configurable (and broken)
From a programming style perspective, I suggest using different cases for keywords and identifiers. Personally, I like uppercase identifiers and lowercase keywords, because it highlights the data that you're manipulating.
SQL standard requires names stored in uppercase
The SQL standard requires identifiers be stored in all-uppercase. See section 5.2.13 of the SQL-92 as quoted from a draft copy in this Answer on another Question. The standard allows you to use undelimited identifiers in lowercase or mixed case, as the SQL processor is required to convert them to the uppercase version as needed.
This requirement presumably dates back to the early days of SQL when mainframe systems were limited to uppercase English characters only.
Non-issue
Many databases ignore this requirement of the standard.
For example, Postgres does just the opposite, converting all unquoted (“undelimited”) identifiers to lowercase — this despite Postgres otherwise hewing closer to the standard than any other system I know of.
Some databases may store the identifier in the case you specified.
Generally this is a non-issue. Virtually all databases do a case-insensitive lookup from the case used by an identifier to the case stored by the database.
There are occasional oddball cases where you may need to specify an identifier in its stored case or you may need to specify all-uppercase. This may happen with certain utilities where you must pass an identifier as a string outside the usual SQL processor context. Rare, but tuck this away in the back of your head in case you encounter some mysterious "cannot find table" kind of error message someday when using some unusual tool/utility. Has happened to me once.
Snake case
Common practice nowadays seems to be to use all lowercase with underscore separating words. This style is known as Snake case.
The use of underscore rather than Camel case helps if your identifiers are ever presented as all uppercase (or all lowercase) and thereby lose readability without the word separation.
Bonus Tip: The SQL standard (SQL-92 section 5.2.11) explicitly promises to never use a trailing underscore in a keyword. So append a trailing underscore to all your identifiers to eliminate all worry of accidentally colliding.
As far as I know for a common L.A.M.P. setup it won't really matter - but be aware that MySQL hosted on Linux is case sensitive!
To keep my code tidy I usually stick to lower case names for tables and columns, uppercase MySQL code and mixed upper-lower-case variables - like this:
SELECT * FROM my_table WHERE id = '$myNewID'
I use Pascal case for field names and lower case for table names (usually), as follows:
students
--------
ID
FirstName
LastName
Email
HomeAddress
courses
-------
ID
Name
Code
[etc]
Why is this cool? Because it's readable, and because I can parse it as:
echo preg_replace('/([a-z])([A-Z])/','$1 $2',$field); //insert a space
NOW, here's the fun part for tables:
StudentsCourses
--------------
Students_ID
Courses_ID
AcademicYear
Semester
Notice I capitalized S and C? That way they point back to the primary table(s). You could even write a routine to logically parse the db structure this way and build queries automatically. So I use caps in table names when they are JOIN tables, as in this case.
Similarly, think of the _ as a -> in this table as: Students->ID and Courses->ID
Not student_id - instead Students_ID - the cognate of the field matches the exact name of the table.
Using these simple conventions produces a readable protocol which handles about 70% of your typical relational structure.
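A routine along those lines could be sketched as follows, here in Python rather than PHP. The function names foreign_refs and build_join are hypothetical, and the code assumes exactly the convention described above: a column named Table_Column in a join table points at Column in the lower-case table of that name.

```python
def foreign_refs(columns):
    """Map columns like 'Students_ID' to (table, column) pairs."""
    refs = {}
    for col in columns:
        table, sep, target = col.partition("_")
        if sep and table and target:
            # Primary tables are lower case in this convention.
            refs[col] = (table.lower(), target)
    return refs


def build_join(join_table, columns):
    """Build a SELECT that joins every table referenced by the join table."""
    clauses = [f"SELECT * FROM {join_table}"]
    for col, (table, target) in foreign_refs(columns).items():
        clauses.append(f"JOIN {table} ON {table}.{target} = {join_table}.{col}")
    return "\n".join(clauses)


query = build_join(
    "StudentsCourses",
    ["Students_ID", "Courses_ID", "AcademicYear", "Semester"],
)
```

For the StudentsCourses table above, this produces a SELECT with one JOIN for students and one for courses, while AcademicYear and Semester are left alone because they contain no underscore.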
If you're using postgresql and PHP, for instance, you'd have to write your query like this:
$sql = "SELECT somecolumn FROM \"MyMixedCaseTable\" where somerow= '$somevar'";
"Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower case. For example, the identifiers FOO, foo, and "foo" are considered the same by PostgreSQL, but "Foo" and "FOO" are different from these three and each other. (The folding of unquoted names to lower case in PostgreSQL is incompatible with the SQL standard, which says that unquoted names should be folded to upper case. Thus, foo should be equivalent to "FOO" not "foo" according to the standard. If you want to write portable applications you are advised to always quote a particular name or never quote it.)"
http://www.postgresql.org/docs/8.4/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS
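If you do end up with mixed-case names, it helps to do the quoting in one place. Below is a minimal sketch in Python; quote_identifier is a hypothetical helper (not part of any driver) that applies the standard rule of wrapping the name in double quotes and doubling any embedded double quote. The %s placeholder assumes a driver that uses that parameter style, so the value is passed separately instead of being pasted into the string.

```python
def quote_identifier(name):
    """Wrap an identifier in double quotes, doubling embedded quotes."""
    if "\x00" in name:
        raise ValueError("identifier must not contain a NUL character")
    return '"' + name.replace('"', '""') + '"'


table = quote_identifier("MyMixedCaseTable")
# The value for somerow is meant to be supplied as a query parameter.
sql = f"SELECT somecolumn FROM {table} WHERE somerow = %s"
```

This yields the same quoted table name as the PHP string above, without hand-written backslash escapes.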
So, sometimes, it depends on what you are doing...
Whatever you use, keep in mind that MySQL on Linux is case sensitive, while on Windows it is case insensitive.
Column names that are mixed case or uppercase have to be double quoted in PostgreSQL. If you don't want to worry about it in the future, name them in lower case.
MySQL: column names are absolutely case insensitive, and that can lead to problems. Say someone has written "mynAme" instead of "myName". The system would work fine, but when some developer later goes searching for it through the source code, they might overlook it, and you all get in trouble.
Every modern database can handle both upper and lower case text.
I think this is worth emphasizing: if a binary or case-sensitive collation is in effect, then (at least in SQL Server and other databases with rich collation features) identifiers and variable names WILL be case sensitive. You can even create tables whose names differ only in case. (I am not sure the info above about the SQL-92 standard is correct; if it is, this part of the standard is not widely followed.)