Linking one table to exactly one of n tables - sql-server

I have a table called ResultIssue, and I want each of the rows of this table to link to exactly one row from one of many other tables, let's call them ResultSource1, ResultSource2 etc. ResultSources all contain different data so they can't be cleanly merged. They all generate the same type of Issue though, which is what the ResultIssue table is for.
How can I build my database to achieve this in a sensible way?
I realise I could just create a ResultIssue table for each source and Union all related rows together to generate my issue list, but I'm wondering if there is a more elegant solution.
This would be similar to having a column for each ResultSource in ResultIssue, and then creating a rule to ensure only one of them is >-1, and then querying this also becomes a nightmare.
I have also been considering a link table for each ResultSource, but I'm not sure how I could force the one result requirement.
I have also considered using an enum, but this feels like it violates some architectural principle, and I'm not sure how I'd query it. (i.e. have columns called LinkedTable and LinkedTableID)
I'm sure I'm not the only one to have this specific issue, so I'm hoping someone can point me at the solution.
Apologies in advance if this question has been asked already, I have been searching for a while, and I don't have the language to express this concisely.
Edit:
Adding a column to each of the ResultSource tables won't solve this either, as each source row can generate multiple issues. I also wouldn't be able to verify that multiple ResultSource tables don't link to the same ResultIssue

Your Sources are in an exclusive subtype relationship. Every Source is of exactly one subtype, where the subtypes are structurally different.
This broadly falls under patterns for "relational inheritance", which is the term you will want to google if you want more info. Also see:
How can you represent inheritance in a database?
Subclassing in relational database
How to do Inheritance Modeling in Relational Databases?
As you will see from those links, there are several patterns for transforming objects with subtypes into relational facts*. Choosing between them depends on your exact requirements, but here is one that is both extensible and well structured, and which I would typically consider my "default" choice (identity columns and clustering choices omitted since those are contingent).
create table Sources
(
sourceId int not null unique,
sourceType int not null,
constraint pk_sources primary key (sourceId, sourceType),
constraint ck_sources_type check (sourceType in (1, 2, 3))
);
create table Type1Sources
(
sourceId int not null unique,
sourceType int not null constraint df_type1sources_type default (1),
constraint pk_type1sources primary key (sourceId, sourceType),
constraint fk_type1sources_sources foreign key (sourceId, sourceType) references Sources,
constraint ck_type1sources_areType1 check (sourceType = 1)
);
-- create table Type2Sources (...);
-- create table Type3Sources (...);
Notice that in Sources the sourceId is unique by itself, but in order to ensure two different subtypes can't refer to the same sourceId, we also add a sourceType attribute.
We then constrain each subtype table, such as Type1Sources, to exactly one sourceType attribute value with a check constraint, and create the foreign key to Sources on the combination {sourceId, sourceType}.
In my example I have a check constraint on Sources.sourceType that allows up to three subtype tables to be created. You can of course add more if needed.
Now that your Sources are represented, all we need to do is create an Issues table, and ensure that every Issue is associated with exactly one Source:
create table Issues
(
issueId int not null,
sourceId int not null,
sourceType int not null,
constraint pk_Issues primary key (issueId),
constraint fk_Issues_Sources foreign key (sourceId, sourceType) references Sources
);
Here I have included the sourceType column on the Issues table, but this is not logically necessary: you could always just join from Issues to Sources on sourceId alone, and get the sourceType that way. Recall that sourceId is unique on Sources, so we can create a foreign key from Issues to Sources on just the sourceId column if we want.
The reason why I have included sourceType on Issues in this example is that it lets me quickly (i.e., without a join) determine the type of source that created the issue.
Whether you choose to do that or not really just depends on whether you want to store the common attributes for all of the different SourceTypes in the Sources table, or whether you want to replicate those common attributes to each of the subtype tables and have the Sources table acting merely as a constraint structure to ensure data integrity. In the former case you will always be joining Issues to Sources if you need those attributes, so there's less incentive to put the sourceType on the Issues table.
* Rows in a database don't represent "entities", they represent "true propositions about your universe of discourse", aka "facts you care about". Hence the "impedance mismatch" between object models and relational models.
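The Sources / Type1Sources / Issues pattern above can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module rather than SQL Server, purely so that it runs anywhere; the Type2Sources table, the sample ids and the `rejected` helper are assumptions added for illustration, and SQLite needs the referenced columns spelled out and foreign keys switched on explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
create table Sources (
    sourceId   integer not null unique,
    sourceType integer not null check (sourceType in (1, 2)),
    primary key (sourceId, sourceType)
);
create table Type1Sources (
    sourceId   integer not null unique,
    sourceType integer not null default 1 check (sourceType = 1),
    primary key (sourceId, sourceType),
    foreign key (sourceId, sourceType) references Sources (sourceId, sourceType)
);
create table Type2Sources (  -- hypothetical second subtype
    sourceId   integer not null unique,
    sourceType integer not null default 2 check (sourceType = 2),
    primary key (sourceId, sourceType),
    foreign key (sourceId, sourceType) references Sources (sourceId, sourceType)
);
create table Issues (
    issueId    integer primary key,
    sourceId   integer not null,
    sourceType integer not null,
    foreign key (sourceId, sourceType) references Sources (sourceId, sourceType)
);
""")


def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Source 10 is a type-1 source that raises two issues.
conn.execute("insert into Sources values (10, 1)")
conn.execute("insert into Type1Sources (sourceId) values (10)")
conn.execute("insert into Issues values (1, 10, 1)")
conn.execute("insert into Issues values (2, 10, 1)")

# (10, 2) does not exist in Sources, so the composite FK blocks this row.
wrong_subtype = rejected("insert into Type2Sources (sourceId) values (10)")
# Claiming type 1 inside the type-2 table trips the check constraint instead.
forged_type = rejected("insert into Type2Sources values (10, 1)")
# An issue must point at an existing (sourceId, sourceType) pair.
orphan_issue = rejected("insert into Issues values (3, 99, 1)")

issue_count = conn.execute("select count(*) from Issues").fetchone()[0]
```

All three bad inserts are rejected by the database itself, while one source row can still be referenced by any number of issues.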

Related

How to store a "primary" record

Suppose I have the following tables
Companies
--CompanyID
--CompanyName
and
Locations
--LocationID
--CompanyID
--LocationName
Every company has at least one location. I want to track a primary location for each company (and yes, every company will have exactly one primary location). What's the best way to set this up? Add a primaryLocationID in the Companies table?
Add a primaryLocationID in the Companies table?
Yes, however that creates a circular reference which could prevent you from inserting new data.
One way to resolve this chicken-and-egg problem is to simply leave Company.PrimaryLocationID NULL-able, so you can temporarily disable one of the circular FKs. This unfortunately means the database will enforce only "1:0..1", but not the strict "1:1" relationship (so you'll have to enforce it in the application code).
However, if your DBMS supports deferred constraints (such as Oracle or PostgreSQL), you can simply defer one of the FKs to break the cycle while the transaction is still in progress. By the end of the transaction both FKs have to be in place, resulting in a real "1:1" relationship.
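As a rough illustration of the deferred-constraint approach, here is a sketch using Python's sqlite3 module. SQLite is swapped in only because it also supports DEFERRABLE INITIALLY DEFERRED foreign keys and needs no server; the table layout and sample values are assumptions, and the syntax on Oracle or PostgreSQL will differ in detail.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN/COMMIT statements below control the transactions explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Company (
    CompanyID         INTEGER PRIMARY KEY,
    CompanyName       TEXT NOT NULL,
    PrimaryLocationID INTEGER NOT NULL
        REFERENCES Location (LocationID) DEFERRABLE INITIALLY DEFERRED
);
CREATE TABLE Location (
    LocationID   INTEGER PRIMARY KEY,
    CompanyID    INTEGER NOT NULL REFERENCES Company (CompanyID),
    LocationName TEXT NOT NULL
);
""")

# Inside one transaction the company may briefly point at a location
# that does not exist yet; the deferred FK is only checked at COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO Company VALUES (1, 'Acme', 100)")
conn.execute("INSERT INTO Location VALUES (100, 1, 'Head office')")
conn.execute("COMMIT")

# A company whose primary location never materialises cannot be committed.
conn.execute("BEGIN")
conn.execute("INSERT INTO Company VALUES (2, 'Orphan Ltd', 200)")
try:
    conn.execute("COMMIT")
    commit_failed = False
except sqlite3.IntegrityError:
    commit_failed = True
    conn.execute("ROLLBACK")

company_count = conn.execute("SELECT COUNT(*) FROM Company").fetchone()[0]
```

Note that this sketch only checks that the primary location exists; making sure it belongs to the same company would need a composite key on {CompanyID, LocationID}.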
The alternative solution is to have a flag in the Locations table that is set for a primary location, and NULL for non-primary locations (note the U1, denoting a UNIQUE constraint, ensuring a company cannot have multiple primary locations):
CREATE TABLE Location (
LocationID INT PRIMARY KEY,
CompanyID INT NOT NULL, -- References Company table, not shown here.
LocationName VARCHAR(50) NOT NULL, -- Possibly UNIQUE?
IsPrimary INT CHECK (IsPrimary IS NULL OR IsPrimary = 1), -- Use a BIT or BOOLEAN if supported by your DBMS.
CONSTRAINT Locations_U1 UNIQUE (CompanyID, IsPrimary)
);
Unfortunately, this has some problems:
It can only guarantee up to "1:0..1" (but not the real "1:1") even on a DBMS that supports deferred constraints.
It requires an additional index (in support of the UNIQUE constraint). Every index brings certain overhead, mostly for INSERT/UPDATE/DELETE performance. Furthermore, secondary indexes in clustered tables contain a copy of the PK, which may make them "fatter" than expected.
It depends on ANSI-compliant composite UNIQUE constraints, which allow duplicate rows if any (but not necessarily all) of the fields are NULL. Unfortunately not all DBMSes follow the standard, so the above would not work out of the box under Oracle or MS SQL Server (but would work under PostgreSQL and MySQL). You could use a filtered unique index instead of the UNIQUE constraint to work around that, but not all DBMSes support that either.
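The filtered unique index mentioned in the last point might look like the sketch below. It is written against SQLite (via Python's sqlite3 module), which calls these partial indexes; SQL Server's filtered indexes use a similar CREATE UNIQUE INDEX ... WHERE form. The index name, the 0/1 encoding of IsPrimary and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Location (
    LocationID   INTEGER PRIMARY KEY,
    CompanyID    INTEGER NOT NULL,
    LocationName TEXT NOT NULL,
    IsPrimary    INTEGER NOT NULL DEFAULT 0 CHECK (IsPrimary IN (0, 1))
);
-- Uniqueness is enforced only over the rows flagged as primary.
CREATE UNIQUE INDEX Location_OnePrimary
    ON Location (CompanyID) WHERE IsPrimary = 1;
""")

conn.execute("INSERT INTO Location VALUES (1, 1, 'HQ', 1)")
conn.execute("INSERT INTO Location VALUES (2, 1, 'Warehouse', 0)")
conn.execute("INSERT INTO Location VALUES (3, 1, 'Depot', 0)")  # many non-primary rows are fine

try:
    conn.execute("INSERT INTO Location VALUES (4, 1, 'Second HQ', 1)")
    second_primary_rejected = False
except sqlite3.IntegrityError:
    second_primary_rejected = True

primary_count = conn.execute(
    "SELECT COUNT(*) FROM Location WHERE CompanyID = 1 AND IsPrimary = 1"
).fetchone()[0]
```

Because the index covers only the primary rows, it avoids relying on NULL handling in UNIQUE constraints, but it still guarantees at most one primary location, not exactly one.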
BaBL86's solution models M:N, while your requirement seems to be 1:N. Nonetheless, that model could be "coerced" into 1:N by placing a key on {LocationID} (and on {CompanyID, TypeOfLocation} to ensure there cannot be multiple locations of the same type for the same company), but it is probably over-engineered for a simple "is primary" requirement.
I think your own solution is the best one - this ensures that every company can only have one primary location. By making it a NOT NULL column, you can even enforce that every company should have a primary location.
Using BaBL86's solution, you don't have those constraints: a company can have 0 - unlimited 'primary locations', which obviously shouldn't be possible.
Do note that, if you use foreign key constraints AND define primaryLocationID as a NOT NULL column, you'll run into problems, because you basically have a loop (Location points to Company, Company points to location). You cannot create a new Company (because it needs a primary location), nor can you create a new Location (because it needs a company).
I do it with a pivot table:
CompanyLocations
--CompanyID
--LocationID
--TypeOfLocation (primary, office, warehouse etc.)
In this case you can select all locations and then use the type as you like. If you create PrimaryLocationID, you'll need two joins to one table and more complex logic, which is worse than this.

Supertype-subtype database design

I have a question about supertype-subtype design in a relational database. If I had a supertype with two subtype tables, I would have the PK of the supertype related to the PK of the two subtype tables as a FK. Let's say I had something like this:
Type
TypeID PK
SuperType
ID PK
TypeID FK
SubtypeA
ID PK,FK
SubtypeB
ID PK,FK
On the database side, how would I ensure that Supertype ID's of a given type were only put into the appropriate subtype table? For example, I would not want a Supertype ID with Type A to be put into the SubtypeB table. Is there a way to easily prevent this from happening on the database side? I know this could be handled in code, but what if the code had mistakes? Or what if someone manually entered the wrong ID into one of the Subtype tables? I guess I'm looking for some way to make this impossible on the database side.
Any ideas? Maybe the PK on the Supertype table should be the ID and TypeID combination with a unique constraint on the ID column to prevent a record with both types being in the SuperType table... and then the Subtype tables would have the combo ID and TypeID PK with a constraint on the TypeID to only be of the type it should be for the appropriate subtype table??
On the database side, how would I ensure that Supertype ID's of a given type were only put into the appropriate subtype table?
On a DBMS that supports deferred constraints, you could do something like this:
With the following constraint on SuperType:
CHECK (
(
(SubtypeAId IS NOT NULL AND SubtypeAId = SuperTypeId)
AND SubtypeBId IS NULL
)
OR
(
SubtypeAId IS NULL
AND (SubtypeBId IS NOT NULL AND SubtypeBId = SuperTypeId)
)
)
These peculiar circular FKs [1] combined with the CHECK ensure both exclusivity and presence of the child (the CHECK ensures exactly one of SuperType.SubtypeAId, SuperType.SubtypeBId is non-NULL and matches the SuperTypeId). Defer the child FKs (or the CHECK if your DBMS supports it) to break the chicken-and-egg problem when inserting new data.
[1] SubtypeA.SubtypeAId references SuperType.SuperTypeId and SuperType.SubtypeAId references SubtypeA.SubtypeAId, ditto for the other subtype.
If your DBMS doesn't support deferred constraints, you could allow (in the CHECK) for both fields to be NULL and forgo the enforcement of the child's presence (you still keep the exclusivity).
Alternatively, just the exclusivity (but not presence) can also be enforced like this:
NOTE: You might need to add a redundant UNIQUE on SuperType {SuperTypeId, TypeId} if the DBMS doesn't support "out-of-key" FKs.
With the following constraint on SubtypeA:
CHECK(TypeId = 1)
And the following constraint on SubtypeB:
CHECK(TypeId = 2)
I used 1 and 2 to denote specific subtypes - you could use anything you like, as long as you are consistent.
Also, you could consider saving storage space by using a calculated column for the subtype's TypeId (such as Oracle 11 virtual columns).
BTW, enforcing presence and exclusivity through the application logic is not considered a bad overall strategy. Most of the time, you should strive to put as much integrity enforcement in the database as you can, but in this particular case doing it at the application level is often considered justified to avoid the complications above.
And finally, "all classes in separate tables" is not the only strategy for implementing inheritance. If you implement inheritance using "everything in one table" or "concrete classes in separate tables", enforcing both the presence and the exclusivity of subtypes becomes much easier.
Take a look at this post for more info.
Use a trigger to propagate the new entry in the supertype table to the appropriate subtype table.

Why does my database table need a primary key?

In my database I have a list of users with information about them, and I also have a feature which allows a user to add other users to a shortlist. My user information is stored in one table with a primary key of the user id, and I have another table for the shortlist. The shortlist table is designed so that it has two columns and is basically just a list of pairs of names. So to find the shortlist for a particular user you retrieve all names from the second column where the id in the first column is a particular value.
The issue is that according to many sources such as this Should each and every table have a primary key? you should have a primary key in every table of the database.
According to this source http://www.w3schools.com/sql/sql_primarykey.asp - a primary key is one which uniquely identifies an entry in a database. So my question is:
What is wrong with the table in my database? Why does it need a primary key?
How should I give it a primary key? Just create a new auto-incrementing column so that each entry has a unique id? There doesn't seem much point for this. Or would I somehow encapsulate the multiple entries that represent a shortlist into another entity in another table and link that in? I'm really confused.
If the rows are unique, you can have a two-column primary key, although maybe that's database dependent. Here's an example:
CREATE TABLE my_table
(
col_1 int NOT NULL,
col_2 varchar(255) NOT NULL,
CONSTRAINT pk_cols12 PRIMARY KEY (col_1,col_2)
)
If you already have the table, the example would be:
ALTER TABLE my_table
ADD CONSTRAINT pk_cols12 PRIMARY KEY (col_1,col_2)
Primary keys must identify each record uniquely and, as was mentioned before, can consist of multiple attributes (one or more columns). First, I'd recommend making sure each record really is unique in your table. Second, as I understand it, you left the table without a primary key, and that's disallowed, so yes, you will need to set a key for it.
In this particular case, there is no purpose in the same pair of user IDs being stored more than once in the shortlist table. After all, that table models a set, and an element is either in the set or isn't. Having an element "twice" in the set makes no sense [1]. To prevent that, create a composite key consisting of these two user ID fields.
Whether this composite key will also be primary, or you'll have another key (that would act as a surrogate primary key) is another matter, but either way you'll need this composite key.
Please note that under databases that support clustering (aka index-organized tables), the PK is often also the clustering key, which may have significant repercussions on performance.
[1] Unlike in a multiset.
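A composite key on the two user ID fields, as described above, can be sketched like this with Python's sqlite3 module. The table and column names (`shortlist`, `user_id`, `shortlisted_user_id`) are hypothetical, since the question does not give the real ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE shortlist (
    user_id             INTEGER NOT NULL,  -- owner of the shortlist
    shortlisted_user_id INTEGER NOT NULL,  -- user placed on it
    PRIMARY KEY (user_id, shortlisted_user_id)
)
""")

conn.execute("INSERT INTO shortlist VALUES (1, 2)")
conn.execute("INSERT INTO shortlist VALUES (1, 3)")

# The same pair a second time would turn the set into a bag of rows.
try:
    conn.execute("INSERT INTO shortlist VALUES (1, 2)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Fetching one user's shortlist is a lookup on the leading key column.
rows = conn.execute(
    "SELECT shortlisted_user_id FROM shortlist "
    "WHERE user_id = 1 ORDER BY shortlisted_user_id"
).fetchall()
```

No extra auto-incrementing column is needed for this; the pair itself is the key.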
A table with duplicate rows is not an adequate representation of a relation. It's a bag of rows, not a set of rows. If you let this happen, you'll eventually find that your counts will be off, your sums will be off, and your averages will be off. In short, you'll get confusing errors out of your data when you go to use it.
Declaring a primary key is a convenient way of preventing duplicate rows from getting into the database, even if one of the application programs makes a mistake. The index you obtain is a side effect.
Foreign key references to a single row in a table could be made by referencing any candidate key. However, it's much more convenient if you declare one of those candidate keys as a primary key, and then make all foreign key references refer to the primary key. It's just careful data management.
The one-to-one correspondence between entities in the real world and corresponding rows in the table for that entity is beyond the realm of the DBMS. It's up to your applications and even your data providers to maintain that correspondence by not inventing new rows for existing entities and not letting some new entities slip through the cracks.
Well, since you are asking: it's good practice, but in a few instances (no joins needed to the data) it may not be absolutely required. The biggest problem, though, is that you never really know whether requirements will change, so you really want one now rather than adding one to a 10m-record table after the fact.
In addition to a primary key (which can span multiple columns btw) I think it is good practice to have a secondary candidate key which is a single field. This makes joins easier.
First some theory. You may remember from high school or college algebra that y = f(x), where f is a function, if and only if for every x there is exactly one y. In relational terms, we would say that y is functionally dependent on x.
The same is true of your data. Suppose we are storing check numbers, checking account numbers, and amounts. Assuming that we may have several checking accounts and that for each checking account duplicate check numbers are not allowed, then amount is functionally dependent on (account, check_number). In general you want to store data together which is functionally dependent on the same thing, with no transitive dependencies. A primary key will typically be the functional dependency you specify as the primary one. This then identifies the rest of the data in the row (because it is tied to that identifier). Think of this as the natural primary key. Where possible (i.e. not using MySQL) I like to declare the primary key to be the natural one, even if it spans across columns. This gets complicated sometimes where you may have multiple interchangeable candidate keys. For example, consider:
CREATE TABLE country (
id serial not null unique,
name text primary key,
short_name text not null unique
);
This table really could have any column be the primary key. All three are perfectly acceptable candidate keys. Suppose we have a country record (232, 'United States', 'US'). Each of these fields uniquely identifies the record so if we know one we can know the others. Each one could be defined as the primary key.
I also recommend having a second, artificial candidate key which is just a machine identifier used for linking for joins. In the above example country.id does this. This can be useful for linking other records to the country table.
An exception to needing a candidate key might be where duplicate records really are possible. For example, suppose we are tracking invoices. We may have a case where someone is invoiced independently for two items with one showing on each of two line items. These could be identical. In this case you probably want to add an artificial primary key because it allows you to join things to that record later. You might not have a need to do so now but you may in the future!
Create a composite primary key.
To read more about what a composite primary key is, visit
http://www.relationaldbdesign.com/relational-database-analysis/module2/concatenated-primary-keys.php

How to implement this data structure in SQL tables

I have a problem that can be summarized as follows:
Assume that I am implementing an employee database. Depending on each person's position, different fields should be filled in. So for example, if the employee is a software engineer, I have the following columns:
Name
Family
Language
Technology
CanDevelopWeb
And if the employee is a business manager I have the following columns:
Name
Family
FieldOfExpertise
MaximumContractValue
BonusRate
And if the employee is a salesperson then some other columns and so on.
How can I implement this in database schema?
One way that I thought is to have some related tables:
CoreTable:
Name
Family
Type
And if type is one then the employee is a software developer and hence the remaining information should be in table SoftwareDeveloper:
Language
Technology
CanDevelopWeb
For business Managers I have another table with columns:
FieldOfExpertise
MaximumContractValue
BonusRate
The problem with this structure is that I am not sure how to make relationship between tables, as one table has relationship with several tables on one column.
How to enforce relational integrity?
There are a few schools of thought here.
(1) store nullable columns in a single table and only populate the relevant ones (check constraints can enforce integrity here). Some people don't like this because they are afraid of NULLs.
(2) your multi-table design where each type gets its own table. Tougher to enforce with DRI but probably trivial with application or trigger logic.
The only problem with either of those, is as soon as you add a new property (like CanReadUpsideDown), you have to make schema changes to accommodate for that - in (1) you need to add a new column and a new constraint, in (2) you need to add a new table if that represents a new "type" of employee.
(3) EAV, where you have a single table that stores property name and value pairs. You have less control over data integrity here, but you can certainly constrain the property names to certain strings. I wrote about this here:
What is so bad about EAV, anyway?
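Option (1) above, a single table with nullable columns guarded by a check constraint, could look roughly like the following sketch. It uses Python's sqlite3 module and trims the question's columns down to one per employee type; the `Type` values 'developer' and 'manager' and the `accepted` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Employee (
    EmployeeID       INTEGER PRIMARY KEY,
    Name             TEXT NOT NULL,
    Family           TEXT NOT NULL,
    Type             TEXT NOT NULL,
    Language         TEXT,  -- software developers only
    FieldOfExpertise TEXT,  -- business managers only
    CHECK (
        (Type = 'developer' AND Language IS NOT NULL AND FieldOfExpertise IS NULL)
     OR (Type = 'manager' AND Language IS NULL AND FieldOfExpertise IS NOT NULL)
    )
)
""")


def accepted(row):
    """Try to insert one employee; report whether the CHECK let it through."""
    try:
        conn.execute(
            "INSERT INTO Employee (Name, Family, Type, Language, FieldOfExpertise) "
            "VALUES (?, ?, ?, ?, ?)",
            row,
        )
        return True
    except sqlite3.IntegrityError:
        return False


developer_ok = accepted(("Ann", "Lee", "developer", "C#", None))
manager_ok = accepted(("Bob", "Ray", "manager", None, "Retail"))
mixed_ok = accepted(("Cy", "Poe", "developer", "Go", "Retail"))  # manager field on a developer
empty_ok = accepted(("Di", "Fox", "manager", None, None))        # manager with no manager data
```

The trade-off noted above still applies: every new property or employee type means editing both the column list and the CHECK.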
You are describing one ("class per table") of the 3 possible strategies for implementing the category (aka. inheritance, generalization, subclass) hierarchy.
The correct "propagation" of PK from the parent to child tables is naturally enforced by straightforward foreign keys between them, but ensuring both presence and the exclusivity of the child rows is another matter. It can be done (as noted in the link above), but the added complexity is probably not worth it and I'd generally recommend handling it at the application level.
I would add a field called EmployeeId in the EmployeeTable
I'd get rid of Type
For BusinessManager table and SoftwareDeveloper for example, I'll add EmployeeId
From here, you can then proceed to create Foreign Keys from BusinessManager, SoftwareDeveloper table to Employee
To further expand on your one way with the core table is to create a surrogate key based off an identity column. This will create a unique employee id for each employee (this will help you distinguish between employees with the same name as well).
The foreign keys preserve your referential integrity. You wouldn't necessarily need EmployeeTypeId as someone else mentioned as you could filter on existence in the SoftwareDeveloper or BusinessManagers tables. The column would instead act as a cached data point for easier querying.
You have to fill in the types in the below sample code and rename the foreign keys.
create table EmployeeType(
EmployeeTypeId
, EmployeeTypeName
, constraint PK_EmployeeType primary key (EmployeeTypeId)
)
create table Employees(
EmployeeId int identity(1,1)
, Name
, Family
, EmployeeTypeId
, constraint PK_Employees primary key (EmployeeId)
, constraint FK_blahblah foreign key (EmployeeTypeId) references EmployeeType(EmployeeTypeId)
)
create table SoftwareDeveloper(
EmployeeId
, Language
, Technology
, CanDevelopWeb
, constraint FK_blahblah foreign key (EmployeeId) references Employees(EmployeeId)
)
create table BusinessManagers(
EmployeeId
, FieldOfExpertise
, MaximumContractValue
, BonusRate
, constraint FK_blahblah foreign key (EmployeeId) references Employees(EmployeeId)
)
No existing SQL engine has solutions that make life easy on you in this situation.
Your problem is discussed at some length in "Practical Issues in Database Management", in the chapter on "entity subtyping". Recommended reading, and not only for this particular chapter.
The proper solution, from a logical design perspective, would be similar to yours, but for the "type" column in the core table. You don't need that, since you can derive the 'type' from which non-core table the employee appears in.
What you need to look at is the business rules, aka data constraints, that will ensure the overall integrity (aka consistency) of the data (of course whether any of these actually apply is something your business users, not me, should tell you) :
Each named employee must have exactly one job, and thus some job detail somewhere. iow : (1) no named employees without any job detail whatsoever and (2) no named employees with >1 job detail.
(3) All job details must be for a named employee.
Of these, (3) is the only one you can implement declaratively if you are using an SQL engine. It's just a regular FK from the non-core tables to the core table.
(1) and (2) could be defined declaratively in standard SQL, using either CREATE ASSERTION or a CHECK constraint that references tables other than the one it is defined on, but neither of those constructs is supported by any SQL engine I know.
One more thing about why [including] the 'type' column is a rather poor choice to make : it changes how constraint (3) must be formulated. For example, you can no longer say "all business managers must be named employees", but instead you'd have to say "all business managers are named employees whose type is <type here>". Iow, the "regular FK" to your core table has now become a reference to a VIEW on your core table, something you might want to declare as, say,
CREATE TABLE BUSMANS ... REFERENCES (SELECT ... FROM CORE WHERE TYPE='BM');
or
CREATE VIEW BM AS (SELECT ... FROM CORE WHERE TYPE='BM');
CREATE TABLE BUSMANS ... REFERENCES BM;
Once again something SQL doesn't allow you to do.
You can keep all fields in the same table, but you'll need an extra table named Employee_Type (for example) in which to put Developer, Business Manager, and so on, each with a unique ID. The relation will then be employee_type_id in the Employee table.
Using PHP or ASP you can control which fields you want to show depending on the employee_type_id (or text) in a drop-down menu.
You are on the right track. You can set up PK/FK relationships from the general person table to each of the specialized tables. You should add a personID to all the tables to use for the relationship, as you do not want to set up a relationship on name: it cannot be a PK because it is not unique. Also, names change, so they are a very poor choice for an FK relationship, as a name change could cause many records to need to change. It is important to use separate tables rather than one because some of those things are in a one-to-many relationship. A developer, for instance, may have many different technologies, and that sort of thing should NEVER be stored in a comma-delimited list.
You could also set up a trigger to enforce that records can only be added to a specialty table if the main record has a particular personType. However, be wary of doing this, as you will have people who change roles over time. Do you want to lose the history of what the person knew when he was a developer when he gets promoted to a manager? Then if he decides to step back down to development (a frequent occurrence) you would have to recreate his old record.

Designing a conditional database relationship in SQL Server

I have three basic types of entities: People, Businesses, and Assets. Each Asset can be owned by one and only one Person or Business. Each Person and Business can own from 0 to many Assets. What would be the best practice for storing this type of conditional relationship in Microsoft SQL Server?
My initial plan is to have two nullable foreign keys in the Assets table, one for People and one for Businesses. One of these values will be null, while the other will point to the owner. The problem I see with this setup is that it requires application logic in order to be interpreted and enforced. Is this really the best possible solution or are there other options?
Introducing SuperTypes and SubTypes
I suggest that you use supertypes and subtypes. First, create PartyType and Party tables:
CREATE TABLE dbo.PartyType (
PartyTypeID int NOT NULL identity(1,1) CONSTRAINT PK_PartyType PRIMARY KEY CLUSTERED,
Name varchar(32) CONSTRAINT UQ_PartyType_Name UNIQUE
);
INSERT dbo.PartyType VALUES ('Person'), ('Business');
SuperType
CREATE TABLE dbo.Party (
PartyID int identity(1,1) NOT NULL CONSTRAINT PK_Party PRIMARY KEY CLUSTERED,
FullName varchar(64) NOT NULL,
BeginDate smalldatetime, -- DOB for people or creation date for business
PartyTypeID int NOT NULL
CONSTRAINT FK_Party_PartyTypeID FOREIGN KEY REFERENCES dbo.PartyType (PartyTypeID)
);
SubTypes
Then, if there are columns that are unique to a Person, create a Person table with just those:
CREATE TABLE dbo.Person (
PersonPartyID int NOT NULL
CONSTRAINT PK_Person PRIMARY KEY CLUSTERED
CONSTRAINT FK_Person_PersonPartyID FOREIGN KEY REFERENCES dbo.Party (PartyID)
ON DELETE CASCADE,
-- add columns unique to people
);
And if there are columns that are unique to Businesses, create a Business table with just those:
CREATE TABLE dbo.Business (
BusinessPartyID int NOT NULL
CONSTRAINT PK_Business PRIMARY KEY CLUSTERED
CONSTRAINT FK_Business_BusinessPartyID FOREIGN KEY REFERENCES dbo.Party (PartyID)
ON DELETE CASCADE,
-- add columns unique to businesses
);
Usage and Notes
Finally, your Asset table will look something like this:
CREATE TABLE dbo.Asset (
AssetID int NOT NULL identity(1,1) CONSTRAINT PK_Asset PRIMARY KEY CLUSTERED,
PartyID int NOT NULL
CONSTRAINT FK_Asset_PartyID FOREIGN KEY REFERENCES dbo.Party (PartyID),
AssetTag varchar(64) CONSTRAINT UQ_Asset_AssetTag UNIQUE
);
The relationship the supertype Party table shares with the subtype tables Business and Person is "one to zero-or-one". Now, while a party in one subtype table generally has no corresponding row in the other, this design does allow a Party to end up in both tables. However, you may actually like this: sometimes a person and a business are nearly interchangeable. If that is not useful, a trigger to prevent it is fairly easy to write, but the best solution is probably to add the PartyTypeID column to the subtype tables, making it part of the PK & FK, and put a CHECK constraint on the PartyTypeID.
The beauty of this model is that when you want to create a column that has a constraint to a business or a person, then you make the constraint to the appropriate table instead of the party table.
Also, if desired, turning on cascade delete on the constraints can be useful, as well as an INSTEAD OF DELETE trigger on the subtype tables that instead delete the corresponding IDs from the supertype table (this guarantees no supertype rows that have no subtype rows present). These queries are very simple and work at the entire-row-exists-or-doesn't-exist level, which in my opinion is a gigantic improvement over any design that requires checking column value consistency.
Also, please notice that in many cases columns that you would think should go in one of the subtype tables really can be combined in the supertype table, such as social security number. Call it TIN (taxpayer identification number) and it works for both businesses and people.
ID Column Naming
The question of whether or not to call the column in the Person table PartyID, PersonID, or PersonPartyID is your own preference, but I think it's best to call these PersonPartyID or BusinessPartyID—tolerating the cost of the longer name, this avoids two types of confusion. E.g., someone unfamiliar with the database sees BusinessID and doesn't know this is a PartyID, or sees PartyID and doesn't know it is restricted by foreign key to just those in the Business table.
If you want to create views for the Party and Business tables, they can even be materialized views since it's a simple inner join, and there you could rename the PersonPartyID column to PersonID if you were truly so inclined (though I wouldn't). If it's of great value to you, you can even make INSTEAD OF INSERT and INSTEAD OF UPDATE triggers on these views to handle the inserts to the two tables for you, making the views appear completely like their own tables to many application programs.
Making Your Proposed Design Work As-Is
Also, I hate to mention it, but if you want to have a constraint in your proposed design that enforces only one column being filled in, here is code for that:
    ALTER TABLE dbo.Assets
    ADD CONSTRAINT CK_Asset_PersonOrBusiness CHECK (
        CASE WHEN PersonID IS NULL THEN 0 ELSE 1 END
        + CASE WHEN BusinessID IS NULL THEN 0 ELSE 1 END = 1
    );
However, I don't recommend this solution.
Final Thoughts
A natural third subtype to add is organization, in the sense of something that people and businesses can have membership in. Supertype and subtype also elegantly solve customer/employee, customer/vendor, and other problems similar to the one you presented.
Be careful not to confuse "Is-A" with "Acts-As-A". You can tell a party is a customer by looking in your order table or viewing the order count, and may not need a Customer table at all. Also don't confuse identity with life cycle: a rental car may eventually be sold, but this is a progression in life cycle and should be handled with column data, not table presence--the car doesn't start out as a RentalCar and get turned into a ForSaleCar later, it's a Car the whole time. Or perhaps a RentalItem, maybe the business will rent other things too. You get the idea.
It may not even be necessary to have a PartyType table. The party type can be determined by the presence of a row in the corresponding subtype table. This would also avoid the potential problem of the PartyTypeID not matching the subtype table presence. One possible implementation is to keep the PartyType table, but remove PartyTypeID from the Party table, then in a view on the Party table return the correct PartyTypeID based on which subtype table has the corresponding row. This won't work if you choose to allow parties to be both subtypes. Then you would just stick with the subtype views and know that the same value of BusinessID and PersonID refer to the same party.
Further Reading On This Pattern
Please see A Universal Person and Organization Data Model for a more complete and theoretical treatment.
I recently found the following articles to be useful for describing some alternate approaches for modeling inheritance in a database. Though specific to Microsoft's Entity Framework ORM tool, there's no reason you couldn't implement these yourself in any DB development:
Table Per Hierarchy
Table Per Type (this is what I advocate above as the only fully normalized method of implementing inheritance in a database)
Table Per Concrete Class
Or a more brief overview of these three ways: How to choose an inheritance strategy
P.S. I have switched, more than once, my opinion on column naming of IDs in subtype tables, due to having more experience under my belt.
You don't need application logic to enforce this. The easiest way is with a check constraint:
    (PeopleID is null and BusinessID is not null) or (PeopleID is not null and BusinessID is null)
You can have another entity from which Person and Business "extend". We call this entity Party in our current project. Both Person and Business have a FK to Party (an is-a relationship), and Asset may also have a FK to Party (a belongs-to relationship).
That said, if in the future an Asset can be shared by multiple Party instances, it is better to create an m:n relationship. It gives flexibility but complicates the application logic and the queries a bit more.
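A minimal sketch of the m:n variant, assuming a junction table between Asset and Party. The table name AssetParty, the column names, and the types are hypothetical and only illustrate the shape of the design.

```sql
-- Sketch only: table names, column names, and types are assumed.
CREATE TABLE dbo.Party (
    PartyID int NOT NULL CONSTRAINT PK_Party PRIMARY KEY
);

CREATE TABLE dbo.Asset (
    AssetID int NOT NULL CONSTRAINT PK_Asset PRIMARY KEY,
    AssetName varchar(100) NOT NULL
);

-- One row per (asset, owner) pair; the composite key prevents duplicate pairs.
CREATE TABLE dbo.AssetParty (
    AssetID int NOT NULL
        CONSTRAINT FK_AssetParty_Asset REFERENCES dbo.Asset (AssetID),
    PartyID int NOT NULL
        CONSTRAINT FK_AssetParty_Party REFERENCES dbo.Party (PartyID),
    CONSTRAINT PK_AssetParty PRIMARY KEY (AssetID, PartyID)
);

-- All owners of one asset, whatever their subtype.
SELECT ap.PartyID
FROM dbo.AssetParty AS ap
WHERE ap.AssetID = 42;
```

The extra join through AssetParty is the added query complexity mentioned above; in exchange, nothing in the schema has to change when an asset gains a second owner.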
ErikE's answer gives a good explanation on how to go about the supertype / subtype relationship in tables and is likely what I'd go for in your situation, however, it doesn't really address the question(s) you've posed which are also interesting, namely:
What would be the best practice for storing this type of conditional relationship in Microsoft SQL Server?
...are there other options?
For those I recommend this blog entry on TechTarget, which has an excerpt from "A Developer's Guide to Data Modeling for SQL Server, Covering SQL Server 2005 and 2008" by Eric Johnson and Joshua Jones that addresses 3 possible options.
In summary they are:
Supertype Table - Almost matches what you've proposed: a single table in which some fields will always be null when others are filled. Good when only a couple of fields aren't shared. So, depending on how different Business and People are, you could possibly combine them into one table, Owners perhaps, and then just have OwnerID in your Asset table.
Subtype Tables - Basically the opposite of Supertype tables, and what you have just now. Here we have lots of unique fields and maybe one or two that are the same, so the repeated fields simply appear in each table. As you are finding, this isn't really suitable for your situation.
Supertype and Subtype Tables - A combination of both of the above, where the matching fields are placed in a single table, the unique ones go in separate tables, and matching IDs are used to join the record from one table to the other. This matches ErikE's proposed solution and, as mentioned, is the one I would favour as well.
Sadly it doesn't go on to explain which, if any, are best practice but it is certainly a good read to get an idea of the options that are out there.
You can enforce the logic with a trigger instead. Then, no matter how the record is changed, only one of the fields will be filled in.
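One way such a trigger might look is sketched below. The Assets layout, the column names PersonID and BusinessID, the trigger name, and the error number are assumptions for illustration; THROW needs SQL Server 2012 or later.

```sql
-- Sketch only: the Assets layout and the PersonID/BusinessID column names are assumed.
CREATE TABLE dbo.Assets (
    AssetID int NOT NULL CONSTRAINT PK_Assets PRIMARY KEY,
    PersonID int NULL,
    BusinessID int NULL
);
-- GO is the SSMS/sqlcmd batch separator; CREATE TRIGGER must start its own batch.
GO

CREATE TRIGGER dbo.TR_Assets_ExactlyOneOwner
ON dbo.Assets
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Reject the statement if any affected row has neither or both owners set.
    IF EXISTS (
        SELECT 1
        FROM inserted AS i
        WHERE (i.PersonID IS NULL AND i.BusinessID IS NULL)
           OR (i.PersonID IS NOT NULL AND i.BusinessID IS NOT NULL)
    )
    BEGIN
        ROLLBACK TRANSACTION;
        THROW 50000, 'An asset must reference exactly one of PersonID or BusinessID.', 1;
    END
END;
```

The check reads the inserted pseudo-table as a set, so a multi-row INSERT or UPDATE is rejected as a whole if even one row breaks the rule.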
You could also have a PeopleAsset table and a BusinessAsset table, but you would still have the problem of enforcing that only one of them has a record.
An asset would have a foreign key to the owning person, and you should set up an association table to link assets and businesses. As said in other comments, you can use triggers and/or constraints to ensure that the data stays in a consistent state, e.g. when you delete a business, delete the lines in your association table.
The People and Businesses tables can both use a UUID as primary key, and you can union both into a view for SQL join purposes.
Then you can simply use one foreign key column in Assets that relates to both People and Businesses, because UUIDs are practically unique across tables. And you can simply query like:
    select * from Assets
    join view_People_Businesses as v on v.id = Assets.fk
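For completeness, a sketch of what the tables and the view behind that query could look like. Only the table names, the view name, v.id, and Assets.fk come from the query above; the column types, the NEWID() defaults, and the owner_type column are assumptions.

```sql
-- Sketch only: column types, defaults, and owner_type are assumed.
CREATE TABLE dbo.People (
    id uniqueidentifier NOT NULL DEFAULT NEWID() PRIMARY KEY
);

CREATE TABLE dbo.Businesses (
    id uniqueidentifier NOT NULL DEFAULT NEWID() PRIMARY KEY
);

CREATE TABLE dbo.Assets (
    id uniqueidentifier NOT NULL DEFAULT NEWID() PRIMARY KEY,
    -- No FOREIGN KEY constraint here: a constraint cannot reference a view.
    fk uniqueidentifier NOT NULL
);
-- GO is the SSMS/sqlcmd batch separator; CREATE VIEW must start its own batch.
GO

CREATE VIEW dbo.view_People_Businesses
AS
SELECT p.id, 'Person' AS owner_type FROM dbo.People AS p
UNION ALL
SELECT b.id, 'Business' AS owner_type FROM dbo.Businesses AS b;
```

Note that in this arrangement Assets.fk is a foreign key by convention only. Since the database cannot declare a constraint against the view, something else (a trigger, for instance) would have to guarantee that every fk value actually exists in People or Businesses.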

Resources