I have been looking for more than two hours, and I have found many articles about table/column naming and other tips, but no exact answer regarding naming the database itself. Can you please tell me the best option? And are there real cases where one or the other makes sense?
clothing_store
ClothingStore
clothingStore
or maybe clothing-store
The MySQL root user sees several default databases named in the first style (information_schema, performance_schema, sys). So does that mean the first is best?
Generally, go with underscores, since they are easy to read in both upper and lower case. It's also the most commonly used convention in reference material and existing databases:
clothing_store
(And CLOTHING_STORE.)
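Many DB engines treat unquoted table names case-insensitively (even if some display them in the case they were created with), though not all: MySQL on a case-sensitive file system, as on most Linux installations, treats database and table names as case-sensitive by default. On a case-insensitive engine these two are the same: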
ClothingStore
clothingStore
And so are "Clothingstore", "clothingstore", "CLotHinGstOre", etc.
Unquoted table names can't contain a hyphen, since it is parsed as a subtraction expression, like a - b:
or maybe clothing-store
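As a rough illustration of the hyphen and case points above, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in chosen because it runs anywhere; it is an assumption that its behaviour is representative, and other engines (MySQL in particular) differ in detail.

```python
import sqlite3

# SQLite stands in for "a DB engine" here; this is an assumption, not MySQL.
conn = sqlite3.connect(":memory:")

# Underscore name: accepted without any quoting.
conn.execute("CREATE TABLE clothing_store (id INTEGER PRIMARY KEY)")

# Unquoted hyphen: the parser reads "clothing - store" and rejects it.
try:
    conn.execute("CREATE TABLE clothing-store (id INTEGER PRIMARY KEY)")
    hyphen_rejected = False
except sqlite3.OperationalError:
    hyphen_rejected = True

# Quoting makes the hyphen legal, but every later query must quote it too.
conn.execute('CREATE TABLE "clothing-store" (id INTEGER PRIMARY KEY)')

# SQLite matches table names case-insensitively, so this is a name clash.
conn.execute("CREATE TABLE ClothingStore (id INTEGER PRIMARY KEY)")
try:
    conn.execute("CREATE TABLE clothingstore (id INTEGER PRIMARY KEY)")
    case_collision = False
except sqlite3.OperationalError:
    case_collision = True

print(hyphen_rejected, case_collision)
```

The underscore variant is the only one of the three that needs neither quoting nor any assumption about case handling.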
You could just call it "store", unless you've got multiple tables of different stores.
Here is a simple question I would like an answer to:
We have a member table. Each member practices one, many or no sports. Initially we (the developers) created a [member] table, a [sports] table and a [member_sports] table, just as we have always done.
However our client here doesn't like this and wants to store all the sports that the member practices in a single varchar column, separated with a special character.
So if:
1 is football
2 is tennis
3 is ping-pong
4 is swimming
and I like swimming and ping-pong, my favourite sports will be stored into the varchar column as:
x3,x4
Now we don't want to just walk up to the client and claim that his system isn't right. We would like to back it up with proof that the operation to fetch the sports from [member_sports] is more efficient than simply storing the fields as a varchar.
Is there any documentation that can back our claims? Help!
Ask your client if they care about storing accurate information [1] rather than random strings.
Then set them a series of challenges. First, ensure that the sport information is in the correct "domain". For the member_sports table, that is:
sport_id int not null
         ^
         |--correct type
For their "store everything in a varchar column" solution, I guess you're writing a CHECK constraint. A regex would probably help here but there's no native support for regex in SQL Server - so you're either bodging it or calling out to a CLR function to make sure that only actual int values are stored.
Next, we not only want to make sure that the domain is correct but that the sports are actually defined in your system. For member_sports, that's:
CONSTRAINT FK_Member_Sports_Sports FOREIGN KEY (Sport_ID) references Sports (Sport_ID)
For their "store everything in a varchar column" I guess this is going to be a far more complex CHECK constraint using UDFs to query other tables. It's going to be messy and procedural. Plus if you want to prevent a row from being removed from sports while it's still referenced by any member, you're talking about a trigger on the sports table that has to query every row in members [2].
Finally, let's say that it's meaningless for the same sport to be recorded for a single member multiple times. For member_sports, that is (if it's not the PK):
CONSTRAINT UQ_Member_Sports UNIQUE (Member_ID,Sport_ID)
For their "store everything in a varchar column" it's another horrifically procedural UDF called from a CHECK constraint.
Even if the varchar variant performed better for certain values of "performs better" (unlikely, since you need to rip strings apart and T-SQL's string manipulation functions are notoriously weak; see above re: regex), how do they propose to ensure that the data is meaningful and not nonsense?
Writing the procedural variants that can also cope with nonsense is an even more challenging endeavour.
In case it's not clear from the above - I am a big fan of Declarative Referential Integrity (DRI). Stating what you want versus focussing on mechanisms is a huge part of why SQL appeals to me. You construct the right DRI and know that your data is always correct (or, at least, as you expect it to be)
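To make the three declarative constraints above concrete, here is a minimal sketch using Python's sqlite3 module rather than SQL Server. The member name and the helper `rejected` are hypothetical, and SQLite's loose column typing means the "correct type" point is not demonstrated here; only the foreign key and unique constraint are.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE member (member_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE sports (sport_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE member_sports (
        member_id INTEGER NOT NULL REFERENCES member (member_id),
        sport_id  INTEGER NOT NULL REFERENCES sports (sport_id),
        CONSTRAINT uq_member_sports UNIQUE (member_id, sport_id)
    );
    INSERT INTO member VALUES (1, 'Alice');  -- hypothetical member
    INSERT INTO sports VALUES (1, 'football'), (2, 'tennis'),
                              (3, 'ping-pong'), (4, 'swimming');
    INSERT INTO member_sports VALUES (1, 3), (1, 4);
""")

def rejected(sql):
    """Return True if the engine refuses the statement on integrity grounds."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

unknown_sport = rejected("INSERT INTO member_sports VALUES (1, 99)")  # no such sport
duplicate = rejected("INSERT INTO member_sports VALUES (1, 3)")       # already recorded
orphaning = rejected("DELETE FROM sports WHERE sport_id = 3")         # still referenced
remaining = conn.execute("SELECT COUNT(*) FROM member_sports").fetchone()[0]

print(unknown_sport, duplicate, orphaning, remaining)
```

None of the three checks needed procedural code; each one is a single declaration on the table.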
[1] "The application will always do this correctly" isn't a good answer. If you manage to build an application and related database in which nobody ever writes some direct SQL to fix something, I guess you'll be the first.
But in most circumstances, there's always more than one application, and even if the other application is a direct SQL client only employed by developers, you're already beyond being able to trust that the application will always act correctly. And bugs in applications are far more likely than bugs in SQL database engines' implementations of constraints, which have been tested far more times than any individual application's attempt to enforce constraints.
[2] Let alone the far more likely query: find all members who are associated with a particular sport. A second index on member_sports makes this a trivial query [3]. No indexes help the "it's somewhere in this string" solution and you're looking at a table scan with no indexing opportunities.
[3] Any index that has sport_id first should be able to satisfy such a query.
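The difference described in notes [2] and [3] can be sketched with SQLite's EXPLAIN QUERY PLAN. This is an illustration under assumptions: SQLite is not SQL Server, the index name and the table member_varchar are made up, and member 4 with the hypothetical code x31 is added only to show that a substring search also returns wrong rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member_sports (
        member_id INTEGER NOT NULL,
        sport_id  INTEGER NOT NULL,
        PRIMARY KEY (member_id, sport_id)
    );
    -- the "second index", with sport_id first
    CREATE INDEX ix_member_sports_sport ON member_sports (sport_id, member_id);
    INSERT INTO member_sports VALUES (1, 3), (1, 4), (2, 1), (3, 3);

    -- the client's proposal: one delimited string per member
    CREATE TABLE member_varchar (member_id INTEGER PRIMARY KEY, sports TEXT);
    INSERT INTO member_varchar VALUES (1, 'x3,x4'), (2, 'x1'), (3, 'x3'), (4, 'x31');
""")

def plan(sql):
    """Join the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

normalized_sql = "SELECT member_id FROM member_sports WHERE sport_id = 3"
varchar_sql = "SELECT member_id FROM member_varchar WHERE sports LIKE '%x3%'"

normalized_plan = plan(normalized_sql)  # expected to mention the index
varchar_plan = plan(varchar_sql)        # expected to be a full scan

normalized_members = sorted(r[0] for r in conn.execute(normalized_sql))
varchar_members = sorted(r[0] for r in conn.execute(varchar_sql))

print(normalized_plan)
print(varchar_plan)
print(normalized_members, varchar_members)
```

The junction-table query seeks into the index, while the string search has to read every row, and in this sketch it also picks up member 4 by accident.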
I have a table
Item(ItemName*, ItemSize*, Price, Notes)
I was using a composite key of (ItemName, ItemSize) to uniquely identify an item. After reading some answers on Stack Overflow suggesting the use of UNIQUE, I revised it to
Item(ItemID*, ItemName, ItemSize, Price, Notes)
But how do I apply a UNIQUE constraint on ItemName and ItemSize?
Please correct me if there is something wrong in the question.
ALTER TABLE Items ADD UNIQUE INDEX(ItemName, ItemSize);
and here's an article explaining how to achieve the same using SQL Server Management Studio.
ALTER TABLE Items ADD CONSTRAINT uc_name_size UNIQUE (ItemName,ItemSize)
Reference: the Oracle and PostgreSQL documentation.
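For a quick check that such a constraint behaves as intended, here is a minimal sketch with Python's sqlite3 module. The column types and sample rows are assumptions, and because SQLite does not support ALTER TABLE ... ADD CONSTRAINT, the constraint is declared in CREATE TABLE instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item (
        ItemID   INTEGER PRIMARY KEY,   -- surrogate key
        ItemName TEXT NOT NULL,
        ItemSize TEXT NOT NULL,
        Price    NUMERIC,
        Notes    TEXT,
        CONSTRAINT uc_name_size UNIQUE (ItemName, ItemSize)
    );
""")

insert = "INSERT INTO Item (ItemName, ItemSize, Price) VALUES (?, ?, ?)"
conn.execute(insert, ("shirt", "M", 10))
conn.execute(insert, ("shirt", "L", 11))  # same name, different size: allowed

try:
    conn.execute(insert, ("shirt", "M", 12))  # same (name, size) pair again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Item").fetchone()[0]
print(duplicate_rejected, row_count)
```

ItemID identifies the row, while the UNIQUE constraint keeps the original business key (ItemName, ItemSize) from being duplicated.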
You are getting hung up on one tool for the task, without understanding that:
as Darin states, at the end of the day, SQL is a character-based language
any and all commands (Data Manipulation, or Data Definition as in this case) are executed on the SQL server as character strings
there are over one hundred GUI SQL Server administration tools, from the rubbish that MS keeps churning out every other year, to mature products that keep growing (not replaced or rewritten)
you can click or drag or whatever in whatever GUI you use, but when you hit the "save" or "apply" button, they all do the same thing: send an SQL character string to the SQL server for processing
Therefore yes, you do need to understand what is happening at the SQL command level if you are going to either administer the server or model/implement databases. Otherwise you will do unintended things when you click or drag.
SQL has been around for over 30 years, and it has come a long way (it is still very limited, but that is not relevant here). In the old days, we only had
{DROP|CREATE} [UNIQUE] [CLUSTERED] INDEX name ON table (columns, ...)
syntax. As it expanded, more Relational constructs were added, and we have the
ALTER TABLE table {ADD CONSTRAINT name {UNIQUE|PRIMARY KEY} (columns, ...) | DROP CONSTRAINT name}
syntax.
Dave Pinal is correct to a point: in terms of data storage structures inside the server, both INDEX and CONSTRAINT syntaxes result in the same thing, an index.
But he is just answering a question, and obviously has not heard about the ISO/IEC/ANSI Standard SQL characteristics that are implied in the newer CONSTRAINT syntax, which are not implied in the INDEX syntax (if you use it, you have to specify those parameters explicitly). More important, there are many parameters that can be supplied in the INDEX syntax which are absent in the CONSTRAINT syntax. So there are significant differences, which may not be relevant to small servers running in a default state.
Generally, people who are inclined towards performance at the physical level, or who have hundreds of tables to administer, use the INDEX syntax; and people who distance themselves from the physical use the CONSTRAINT syntax.
The point is, use one XOR the other, do not use a combination: that leads to creating duplicate indices that you are not even aware of (because they do not show up in the broken MS GUI panel that you are looking at).
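The duplicate-index effect is easy to reproduce. The sketch below uses SQLite through Python's sqlite3 module as a stand-in (an assumption; the original discussion is about SQL Server), with a hypothetical index name ix_name_size, and lists the indexes that exist after both syntaxes have been applied to the same columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item (
        ItemID   INTEGER PRIMARY KEY,
        ItemName TEXT NOT NULL,
        ItemSize TEXT NOT NULL,
        CONSTRAINT uc_name_size UNIQUE (ItemName, ItemSize)  -- CONSTRAINT syntax
    );
    -- INDEX syntax on the very same columns: accepted, but redundant
    CREATE UNIQUE INDEX ix_name_size ON Item (ItemName, ItemSize);
""")

# PRAGMA index_list returns one row per index; column 1 is the index name.
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list('Item')"))
print(index_names)
```

Two indexes now cover (ItemName, ItemSize): one created implicitly by the constraint and one created explicitly, and every insert has to maintain both.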
Dave confuses things: there is no such thing as a "Primary Key index". It is either a Primary Key Constraint or an Index (which may be the Primary Key, and have settings relevant to supporting a PK).
The next thing that will confuse anyone, beginner or otherwise, is that you are used to seeing all kinds of funny drawings that are supposed to depict data or data models, and they do not. MS is the worst offender: in each different product, there is quite a different funny diagram and set of symbols. There is no commonality or standard; there are symbols that depict important aspects of the design in one picture that you may want in another picture, and you can't get them.
Well, actually there is a Relational Database Modelling Standard, called IDEF1X. But MS have not heard of it. The idea is that with a standard, all the important information regarding the model, the subtleties, etc., is identified in the single model. Many different teams can use the single model. And of course it has its standard set of symbols and notation.
Point is, learn the standards; it will clear up a lot of confusion for you. Then, regardless of what GUI tool you have on your screen today, you will know clearly what you want/have in your data model, and what is going on inside the server.
Point is, re "how do I do this graphically", you do it in any diagramming tool, because you own the model, and you choose the settings on the tables. No MS GUI has ever given that to you, or ever will.
The GUI is not a substitute for knowledge.
Could you explain: if, as you state, (ItemName, ItemSize) forms a unique key on that particular table, on what basis do you think you need (ItemName, ItemSize, plus anything more) instead? How can you get more unique than unique?
I've been reading a couple of questions/answers on Stack Overflow trying to find the 'best', or should I say most accepted, way to name tables in a database.
Most developers tend to name the tables depending on the language that uses the database (Java, .NET, PHP, etc). However, I just feel this isn't right.
The way I've been naming tables till now is doing something like:
doctorsMain
doctorsProfiles
doctorsPatients
patientsMain
patientsProfiles
patientsAntecedents
The things I'm concerned about are:
Legibility
Quick identification of the module the table is from (doctors||patients)
Easy to understand, to prevent confusion.
I would like to read any opinions regarding naming conventions.
Thank you.
Being consistent is far more important than what particular scheme you use.
I typically use PascalCase and the entities are singular:
DoctorMain
DoctorProfile
DoctorPatient
It mimics the naming conventions for classes in my application keeping everything pretty neat, clean, consistent, and easy to understand for everybody.
Since the question is not specific to a particular platform or DB engine, I must say for maximum portability, you should always use lowercase table names.
/[a-z_][a-z0-9_]*/ is really the only pattern of names that seamlessly translates between different platforms. Lowercase alpha-numeric+underscore will always work consistently.
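The pattern above can be turned into a small check, for example in a migration lint step. This is a minimal sketch; the function name and the sample names are made up, and it deliberately ignores reserved words and per-engine length limits.

```python
import re

# The portable pattern from the answer: lowercase letters, digits, underscores,
# and no leading digit.
PORTABLE_NAME = re.compile(r"[a-z_][a-z0-9_]*")

def is_portable_name(name):
    """True if the whole name matches the lowercase/underscore pattern."""
    return PORTABLE_NAME.fullmatch(name) is not None

candidates = ["clothing_store", "ClothingStore", "clothing-store",
              "2nd_store", "doctor_profile"]
results = {name: is_portable_name(name) for name in candidates}

for name, ok in results.items():
    print(f"{name:16} {'ok' if ok else 'not portable'}")
```

Only clothing_store and doctor_profile pass; the mixed-case, hyphenated, and digit-first names are flagged.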
As mentioned elsewhere, relation (table) names should be singular: http://www.teamten.com/lawrence/programming/use-singular-nouns-for-database-table-names.html
The case-insensitive nature of SQL favours an Underscores_Scheme. Modern software supports any kind of naming scheme. However, nasty bugs, errors, or the human factor can sometimes lead to UPPERCASINGEVERYTHING, and in that case those who chose a scheme with underscores (Pascal_Case or underscore_case) keep their nerves intact, because the word boundaries survive.
An aggregation of most of the above:
don't rely on case in the database
don't consider the case or separator part of the name - just the words
do use whatever separator or case is the standard for your language
Then you can easily translate (even automatically) names between environments.
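If only the words are the name, translating between environments is mechanical. The sketch below is one possible way to do it; the function names are made up, and it assumes simple names without acronym runs such as HTTPServer.

```python
import re

def words(name):
    """Split a snake_case, camelCase or PascalCase name into lowercase words."""
    # Insert an underscore wherever a lowercase letter or digit meets an uppercase letter.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    return [w.lower() for w in spaced.split("_") if w]

def to_snake(name):
    return "_".join(words(name))

def to_pascal(name):
    return "".join(w.capitalize() for w in words(name))

def to_camel(name):
    ws = words(name)
    if not ws:
        return ""
    return ws[0] + "".join(w.capitalize() for w in ws[1:])

print(to_snake("DoctorProfile"))    # database side
print(to_pascal("doctor_profile"))  # e.g. a C# class
print(to_camel("doctor_profile"))   # e.g. a Java field
```

Because all three functions go through the same word list, a name survives a round trip between the conventions unchanged.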
But I'd add another consideration: you may find that there are other factors when you move from a class in your app to a table in your database: the database object has views, triggers, stored procs, indexes, constraints, etc - that also need names. So for example, you may find yourself only accessing tables via views that are typically just a simple "select * from foo". These may be identified as the table name with just a suffix of '_v' or you could put them in a different schema. The purpose for such a simple abstraction layer is that it can be expanded when necessary to allow changes in one environment to avoid impacting the other. This wouldn't break the above naming suggestions - just a few more things to account for.
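The '_v' view layer can be sketched as follows with Python's sqlite3 module. The table foo comes from the text above; its columns, the sample rows, and the soft-delete rule are assumptions used only to show the view being expanded later without touching the application query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        deleted INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO foo (id, name, deleted) VALUES (1, 'first', 0), (2, 'second', 1);

    -- Initially the view is just a pass-through.
    CREATE VIEW foo_v AS SELECT * FROM foo;
""")

# The application only ever knows about the view.
APP_QUERY = "SELECT id, name FROM foo_v ORDER BY id"
before = conn.execute(APP_QUERY).fetchall()

# Later the database side changes its rules; only the view is redefined.
conn.executescript("""
    DROP VIEW foo_v;
    CREATE VIEW foo_v AS SELECT id, name FROM foo WHERE deleted = 0;
""")
after = conn.execute(APP_QUERY).fetchall()

print(before)
print(after)
```

The application query string is identical in both runs; the change in behaviour lives entirely in the view definition.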
I use underscores. I did an Oracle project some years ago, and it seemed that Oracle forced all my object names to upper case, which kind of blows any casing scheme. I am not really an Oracle guy, so maybe there was a way around this that I wasn't aware of, but it made me use underscores and I have never gone back.
I tend to agree with the people who say it depends on the conventions of language you're using (e.g. PascalCase for C# and snake_case for Ruby).
Never camelCase, though.
After reading a lot of other opinions, I think it's very important to use the naming conventions of the language. Consistency is more important than naming conventions only if you are (and will remain) the only developer of the application. If you want readability (which is hugely important), you had better use the naming conventions of each language. In MySQL, for example, I don't suggest using CamelCase, since not all platforms are case-sensitive. So underscores work better there.
These are my five cents. I came to the conclusion that if DBs from different vendors are used in one project, there are two good options:
Use underscores.
Use camel case with quotes.
The reason is that some databases convert all characters of unquoted names to uppercase and some to lowercase. So, if you have myTable, it will become MYTABLE or mytable when you work with the DB.
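The folding rule can be modelled in a few lines. This is a toy model, not a database client: the function name is made up, and the "upper" and "lower" modes are assumed to correspond to engines such as Oracle and PostgreSQL respectively.

```python
def stored_name(identifier, unquoted_folding):
    """Name the catalog would hold for an identifier as written in SQL.

    unquoted_folding is "upper" or "lower"; delimited (double-quoted)
    identifiers keep their case exactly.
    """
    if len(identifier) >= 2 and identifier[0] == '"' and identifier[-1] == '"':
        return identifier[1:-1]
    if unquoted_folding == "upper":
        return identifier.upper()
    return identifier.lower()

examples = {
    (written, mode): stored_name(written, mode)
    for written in ("myTable", '"myTable"', "my_table")
    for mode in ("upper", "lower")
}

for (written, mode), stored in examples.items():
    print(f"{written:10} folding={mode:5} -> {stored}")
```

In this model the underscore name stays readable under either folding, and the quoted camel case name is the only way to keep its capitals, which is why those two options are the ones listed.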
Naming conventions exist within the scope of a language, and different languages have different naming conventions.
SQL is case-insensitive by default; so, snake_case is a widely used convention. SQL also supports delimited identifiers; so, mixed case is an option, like camelCase (Java, where fields == columns) or PascalCase (C#, where tables == classes and columns == fields). If your DB engine can't support the SQL standard, that's its problem. You can decide to live with that or choose another engine. (And why C# just had to be different is a point of aggravation for those of us who code in both.)
If you only ever intend to use one language in your services and applications, use the conventions of that language at all layers. Otherwise, use the most widely used conventions of each language in the domain where that language is used.
C# approach
Singular/Plural
Use singular if the column holds just one value per row.
If it is an array, go for plural. That also makes perfect sense when you foreach over the elements. E.g. your array column MostVisitedLocations contains: London, NewYork, Bratislava
then:
foreach (var mostVisitedLocation in MostVisitedLocations)
{
    // go through each array element
}
Casing
PascalCase for table names and camelCase for columns made the most sense to me. But in my case, in .NET 5, when I had JSON objects saved in the database with property names in camelCase, System.Text.Json wasn't able to deserialise them to objects with its default settings. The model has to be public, and public properties are PascalCase, so mapping table columns (camelCase) and JSON property names (camelCase) to these properties can fail, because the mapping is case-sensitive by default.
Btw, with Newtonsoft.Json this problem is not present.
So I ended up with:
Tables: App.Admin, App.Pricing, UserData.Account
Columns: Id, Price, IsOnline.
2 suggestions based on use cases:
Singular table names.
Although I once believed in pluralizing table names, I found in practice that there is little to no benefit to it, other than matching the human tendency to think of tables as collections.
When singularising the table names, you can silently add -table to the singular table name in your head, and then it all makes sense again.
SELECT username FROM UserTable
Sounds more natural than
SELECT username FROM UsersTable
But post-fixing every table name with Table is just a waste.
The actual practical argumentation for singularising table names:
What is the plural of person: persons or people?
This is still ok.
But how do you like a table with postfix -status? Statuses?
That sucks, sorry.
It is easy to inadvertently make a human mistake by singularizing the status table, but pluralizing the other tables.
PascalCasing + Underscore convention.
Given table User, Role and a many-to-many table User_Role.
An underscore-cased user_role is ambiguous when all table names use underscores by default.
Is user_role a table that contains user roles? In this case it is not, it is a join table.
When deciding on table name conventions, I think it is useful to let go of personal preference and take into account the practical considerations of real-life problems, in order to minimize ambiguous situations.
As the many answers and opinions have indicated, whatever your personal opinion is, different people think differently, and you will not be the only person working on the database despite being the one who sets it up (unless you are, in which case you're only helping yourself).
Therefore it is useful to have practical argumentation (practical in the sense of, does it help my future co-workers to avoid dubious situations) when your past decision is being questioned.
Unfortunately there is no "best" answer to this question. As #David stated, consistency is far more important than the naming convention.
There's wide variability in how to separate words, so there you'll have to pick whatever you like better; but at the same time, there seems to be near consensus that the table name should be singular.
We are going to develop a new system over a legacy database (using .NET C#). There are approximately 500 tables. We have chosen to use an ORM tool for our data access layer. The problem is with the naming convention for the entities.
The tables in the database have names like TB_CUST (table with customer data) or TP_COMP_CARS (company cars). The first letter of the prefix defines the module and the second letter defines its relation to other tables.
I would like to give the entities more meaningful names, e.g. name TB_CUST just Customer or CustomerEntity. Of course there would be a comment pointing to its table name.
But the DBA, who is also a programmer, doesn't want names like this. He wants the entity names to be exactly the same as the table names. He says that he would have to remember two names and that it would be difficult and messy. I have to say he's not really familiar with the principles of OOP.
But with an entity name like TP_COMP_CARS there would be method names like GetTP_COMP_CARS or SaveTP_COMP_CARS. I think this is unreadable and ugly.
So please tell me your opinion. Who is right, and why?
Thank you in advance.
The whole idea of ORM tools is to avoid the need to remember database objects.
We usually create a database class with all the table and column names, so no one needs to remember anything, and the ORM should map database "names" to normal entities.
Although it is subjective, in my opinion you are right and he is wrong....
Who is going to work mostly with the new code? That person should decide the naming convention IMHO.
Personally of course I would go for your solution because as has already been mentioned, if you use ORM you don't need to hit the DB directly often.
As a compromise you could use names like TB_CUST where you act directly with the database, but then use names like Customer for your Data Transfer Objects. Writing good code involves creating an abstraction of any data sources you might be working with. Having GetTB_CUST() throughout your code is a little like having GetTB_CUSTFromThatSQLDatabaseWeHave() dotted around the place.
I personally hate table names like that, but it's a legacy system and I'm sure the DBA doesn't feel like renaming the tables. Renaming the tables could be an option. You would just have to create views representing the old table names so that your legacy system keeps running while you develop your new system. If this is not an option you can use the ORM to map table names to entity names. Or you can abstract your ORM away in a data access layer and define nice entity names in your domain model, having your DAL do the name conversion.
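The rename-plus-view option can be sketched like this, using SQLite through Python's sqlite3 module as a stand-in for the real engine. The new table name customer and the columns CUST_ID and CUST_NAME are hypothetical. Note that in SQLite a view is read-only, so legacy code that writes through the old name would need INSTEAD OF triggers there; that part is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The legacy table as it exists today (columns are made up).
    CREATE TABLE TB_CUST (CUST_ID INTEGER PRIMARY KEY, CUST_NAME TEXT NOT NULL);
    INSERT INTO TB_CUST VALUES (1, 'Acme');

    -- Give the table a meaningful name...
    ALTER TABLE TB_CUST RENAME TO customer;

    -- ...and keep the old name alive as a view for the legacy system.
    CREATE VIEW TB_CUST AS SELECT * FROM customer;
""")

legacy_rows = conn.execute("SELECT CUST_ID, CUST_NAME FROM TB_CUST").fetchall()
new_rows = conn.execute("SELECT CUST_ID, CUST_NAME FROM customer").fetchall()

print(legacy_rows)
print(new_rows)
```

Old queries against TB_CUST and new queries against customer return the same rows, so the two systems can coexist during the transition.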
The naming conventions used in the two domains simply don't align. Java, for example, has very well-defined rules/conventions for class names and field names, where capitalisation is significant. In general, your application may be ported to a completely different database with a different naming standard, so it's not reasonable to demand alignment of names in the business logic with names in the database. Consider also a slightly more complex mapping: one entity may not correspond to one table.
And, really, come on ...
Customer == TB_CUST
That is just not so hard! I'm with you: make the names meaningful in the code and map them in the ORM. The learning for the DBA/programmer should not be that painful; my guess is that it's one of those things that feels much worse in the anticipation than in the reality.
If there are 500 tables in the database - you've already got a challenge keeping those names straight. Hopefully, you've got metadata and some graphical models that describe them more meaningfully.
When you create the next 500 ORM objects - you'll have another challenge. Even if you give them meaningful names it's still too many to really hope that all will be obvious. So, now you've got 2 problems.
If there's no way to link those two sets of 500 tables together - then you've got 3 problems. Think about debugging performance of queries in the ORM, and going to the DBA with names that he doesn't recognize. Think about your carefully crafted names - that then must be ignored when you create reports that hit the database directly.
So, I'd try very hard to use the database names in the ORM. But I would tweak a few things:
if a name is too cryptic to understand - I'd work with the DBA to improve its name. An easy way to transition to better names is through views. Ideally you get rid of the original confusing name eventually tho.
changing from underscores to camelcase, etc shouldn't be considered a change to the name - it's just a change to the separator. So, use the appropriate name for your language.
the database prefixes can probably be dropped - they're actually just attributes of the table name that have been "denormalized" and grafted onto the name. They may be necessary to avoid duplication if they indicate a subsection of the model, but in general may not be as important.
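The last two tweaks (separator change and prefix removal) can be applied mechanically while keeping a lookup back to the legacy name, so the DBA's names stay recognisable. This is a sketch under assumptions: the function and dictionary names are made up, and the prefix is assumed to be exactly two letters followed by an underscore, as in the question.

```python
def entity_name(table_name):
    """Drop a two-letter module/relation prefix and PascalCase the rest."""
    prefix, _, rest = table_name.partition("_")
    if len(prefix) != 2 or not rest:
        rest = table_name  # no recognisable prefix: keep the whole name
    return "".join(word.capitalize() for word in rest.split("_") if word)

LEGACY_TABLES = ["TB_CUST", "TP_COMP_CARS"]

# Entity name -> legacy table name, so either side can be looked up.
ENTITY_TO_TABLE = {entity_name(t): t for t in LEGACY_TABLES}

# If dropping prefixes made two tables collide, the prefixes are needed after all.
assert len(ENTITY_TO_TABLE) == len(LEGACY_TABLES), "prefix removal caused a clash"

for entity, table in ENTITY_TO_TABLE.items():
    print(f"{entity:10} <-> {table}")
```

The words of each name are unchanged, so a DBA looking at CompCars can still find TP_COMP_CARS, and the collision check flags the cases where a prefix really does carry meaning.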
"I have to say his not really familiar with the principles of OOP.
But in case of an entity name like TP_COMP_CARS there should be methods names like Get TP_COMP_CARS or SaveTP_COMP_CARS..I think this is unreadable and ugly.
So please tell me your opinion. Who is right and why."
Which names are given to the things your IT systems manage has absolutely nothing to do with "the principles of OOP".
The same holds for which names are given to "standard" "getter and setter" methods : those are just agreements and conventions, not "principles of OOP".
The issue is a certain kind of "ergonomics" (and thus also the self-documenting value) of the code.
It is true that getTP_COMP_CARS looks ugly (though not, as you claim, "unreadable"). It is also true that if you start adhering to "more modern" naming conventions, then there will have to be someone somewhere who will have to maintain a mapping between the names that are synonymous. (And it is untrue that names such as TP_COMP_CARS are less self-documenting than full "natural-language-words" names : usually such names are constructed from a VERY SMALL set of mnemonic words that are used over and over again with the same meaning, making it more than easy enough for anyone to remember them.)
There is no right and wrong about this. Names like that were the usual convention in the days before the ones we live in now. At least, those names usually had the benefit of being case-insensitive, as opposed to the braindead (because case-sensitive) naming rules that are imposed upon us by so-called "more modern" systems.
Twenty years from now, people will call the naming conventions we use these days "braindead" too.