EER : Superclass/subclass Entity relationship, primary key mapping - database

Here is the scenario.
STUDENT and FACULTY are subclasses of the PERSON entity, and each has specialized attributes.
Normally, we store the common attributes in the PERSON table (with p_id as the PK) and the specialized attributes in the subclass tables. We map each subclass to the superclass by adding a p_id column to the subclass table.
However, is it acceptable to do something like the following?
Instead of p_id as the mapping attribute in the subclass, can we use some other attribute of the superclass that is unique but not the PK?
NOTE: The EER diagram (conceptual design) still remains the same!

It's just a foreign key, even for supertype/subtype schemas. You can reference any column that's declared UNIQUE.
I'm pretty sleepy, so I'm not sure how that would affect the updatable views. I don't think it would affect them, though. (Assuming you're using them. Some don't bother.)
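As a rough sketch of the point above, here is a subtype table whose foreign key references a UNIQUE column of the supertype instead of its primary key. It uses Python's built-in sqlite3 module; the email column and the sample rows are assumptions made up for illustration, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs unless asked

conn.executescript("""
CREATE TABLE person (
    p_id   INTEGER PRIMARY KEY,
    email  TEXT NOT NULL UNIQUE,   -- hypothetical alternate key, not the PK
    name   TEXT NOT NULL
);
CREATE TABLE student (
    -- maps to the supertype through the UNIQUE column rather than p_id
    email  TEXT PRIMARY KEY REFERENCES person(email),
    major  TEXT
);
""")

conn.execute("INSERT INTO person (p_id, email, name) VALUES (1, 'ann@example.org', 'Ann')")
conn.execute("INSERT INTO student (email, major) VALUES ('ann@example.org', 'Physics')")

# A student row with no matching person row should be rejected by the FK.
try:
    conn.execute("INSERT INTO student (email, major) VALUES ('nobody@example.org', 'Math')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Declaring the referenced column NOT NULL as well as UNIQUE keeps it usable as a full alternate key.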

Related

Database: foreign keys between class and just 1 of the subclasses. Use single FK column?

I'm reworking and extending an existing data model where a section covers person data. The current person table is just 1 big table containing all fields, both for natural and legal persons and the non-relevant fields just remain empty.
As we're adding more and more fields, the idea would be to have a single PERSON table and 2 subclasses NATURALPERSON and LEGALPERSON, where a person could never be both at the same time.
It sounds easy enough, but after some reading I started doubting my initial approach. What would you do?
The first option I had in mind was to have a single column in the PERSON table for the foreign key, LEGAL_NATURAL, which would be a pointer to either LEGALPERSON or NATURALPERSON. To ensure mutual exclusivity, the record IDs for the subclasses could be generated from a single sequence.
SELECT *
FROM PRSN pr
LEFT JOIN LEGALPERSON_DETAIL lp ON pr.legal_natural = lp.id
LEFT JOIN NATURALPERSON_DETAIL np ON pr.legal_natural = np.id;
Instead of 1 column for 2 FKs, an alternative would be to have 2 columns in the PERSON table (e.g. NATURALPERSON, LEGALPERSON), each with a possible pointer to a subclass. A constraint could then make sure both aren't filled at the same time. This could make the FK relationship more obvious.
A different approach would be to have the subclasses point to the PERSON table. This has the disadvantage that the PERSON table doesn't show whether a record is a natural or a legal person, but it might be a nicer design overall.
I found some info on exclusive arcs in "Database development mistakes made by application developers".
Is there a clear winner here?
The design of your three tables looks good.
As far as PKs and FKs are concerned, I recommend a technique called Shared Primary Key.
The person table has an ID field which functions as a PK.
The natural person and legal person tables do not have an independent ID field. Instead, both subclasses use PersonID as the PK in their own table. PersonID is also an FK that references ID in the person table.
This makes joins simple, easy, and fast.
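The shared primary key layout described above could look roughly like this. It is a sketch using Python's built-in sqlite3 module; the table names follow the question, while the birth_date and company_form columns and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs unless asked

conn.executescript("""
CREATE TABLE person (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL
);
-- Each subclass reuses the person id as its own PK and as an FK to person.
CREATE TABLE natural_person (
    person_id   INTEGER PRIMARY KEY REFERENCES person(id),
    birth_date  TEXT                       -- hypothetical attribute
);
CREATE TABLE legal_person (
    person_id     INTEGER PRIMARY KEY REFERENCES person(id),
    company_form  TEXT                     -- hypothetical attribute
);
INSERT INTO person VALUES (1, 'Ann Smith'), (2, 'Acme Ltd');
INSERT INTO natural_person VALUES (1, '1980-05-17');
INSERT INTO legal_person VALUES (2, 'Ltd');
""")

# Joining on the shared key needs no extra pointer column in person.
rows = conn.execute("""
    SELECT p.id, p.name, np.birth_date, lp.company_form
    FROM person p
    LEFT JOIN natural_person np ON np.person_id = p.id
    LEFT JOIN legal_person  lp ON lp.person_id = p.id
    ORDER BY p.id
""").fetchall()
```

Note that this sketch does not by itself stop a person from appearing in both subclass tables; that rule would need an extra constraint, such as a type column in person.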

Relational Database Inheritance foreign keys and primary keys

I'm working on a Database in which I'm trying to deduce the best ways to apply inheritance.
So far I have 2 subclasses of an Entity, and I asked in "Extended Entity-Relationship Model to tables (subclasses)" how to implement this in relational tables.
I decided to go with Concrete Table so I created 2 tables, one for each subclass of the Entity. I encountered 2 problems:
1. My primary keys were declared as id int primary key autoincrement, which means the first row of each table is going to have id = 1. So the key isn't actually unique across the two tables, and when referencing it from another table there is no way to know which of the 2 subclass tables we are referencing (unless I add an unnecessary(?) extra column).
2. When adding a foreign key that references said id, the foreign key is supposed to reference both subclass tables, but I don't know if that is even possible.
Any ideas or opinions about how this could be done would help a lot. Thanks.
It would probably make sense to have the child class tables reference the parent class, instead of the other way around. Then you can have an id column on the Entity table which is unique and foreign keys from the children to their parent instances. Presumably this will help when you want to use the data to instantiate an object in your code as well, since you should know which class you are instantiating and only care about its ancestors, not its children.

Implementing tree in a database with child having more than one parent

I have a table with the fields Id, Name, ParentId (referencing Id) and Leaf. I want to model a tree-like structure where a child/element with Leaf=1 can have more than one parent. How can I model this situation in this table, or do I need an extra table to handle it? I want this model for implementing tags like those on Stack Overflow.
You'll need another table, unless "more than one parent" has a smallish upper limit, in which case you can add ParentID fields for the number of possible parents, but this is not recommended.
You appear to have a many-to-many relationship. This can be modelled as below:
Entity table
ID (Primary Key)
... - other entity fields
Parents table
ChildID (Foreign Key - Entity.ID)
ParentID (Foreign Key - Entity.ID)
The Leaf=1 entities being the only ones allowed to have multiple parents is a constraint that is best handled on a code level, or possibly with database triggers.
It doesn't seem possible to enforce this directly without creating a third table that holds all entities with Leaf=1 (either linking to an Entity entry or defining the row only there). I would not advise either variant: it's messy and not the type of constraint you design your database around.
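The Entity/Parents layout above can be tried out with a small sketch like the following, using Python's built-in sqlite3 module. The composite primary key on the link table and the sample tag names are assumptions added for illustration, and the "only Leaf=1 rows may have several parents" rule is deliberately not enforced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs unless asked

conn.executescript("""
CREATE TABLE entity (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    leaf  INTEGER NOT NULL DEFAULT 0
);
-- One row per (child, parent) pair, so a child may have many parents.
CREATE TABLE parents (
    child_id   INTEGER NOT NULL REFERENCES entity(id),
    parent_id  INTEGER NOT NULL REFERENCES entity(id),
    PRIMARY KEY (child_id, parent_id)   -- assumption: no duplicate pairs
);
INSERT INTO entity VALUES (1, 'databases', 0), (2, 'modeling', 0), (3, 'eer-question', 1);
INSERT INTO parents VALUES (3, 1), (3, 2);
""")

# All parents of the leaf entity with id 3.
parent_names = [row[0] for row in conn.execute("""
    SELECT p.name
    FROM parents AS rel
    JOIN entity AS p ON p.id = rel.parent_id
    WHERE rel.child_id = 3
    ORDER BY p.name
""")]
```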

Tables with a common primary key

What's the term describing the relationship between tables that share a common primary key?
Here's an example:
Table 1
property(property_id, property_location, property_price, ...);
Table 2
flat(property_id, flat_floor, flat_bedroom_count, ...);
What you have looks like table inheritance. If your table structure is that all flat records represent a single property but not all property records refer to a flat, then that's table inheritance. It's a way of modeling something close to object-oriented relationships (in other words, flat inherits from property) in a relational database.
If I understand your example correctly, the data modeling term is Supertype/Subtype. This is a modeling technique where you define a root table (the supertype) containing common attributes, and one or more referencing tables (subtypes) that contain varying attributes based on the entities being modeled.
For example, you could have a Person table (the supertype) containing columns for attributes pertaining to all people, such as Name. You could then have an Employee table (the subtype) containing attributes specific to employees only, such as rate of pay and hire date. You could then continue this process with additional tables for other specializations of Person, such as Contractor. Each of the subtype tables would have a PersonID key column, which could be the primary key of the subtype table, as well as a foreign key referencing the Person table.
For additional info, search Google for "supertype and subtype entities", and see the links below.
http://www.learndatamodeling.com/dm_super_type.htm
http://technet.microsoft.com/en-us/library/cc505839.aspx
There isn't a good name for this relationship in common database terminology (as far as I know). It's not a one-to-one relationship because there isn't guaranteed to be a record in the "extending" table for each record in the main table. It's not a one-to-many relationship because there is a maximum of one record allowed on what would otherwise be the "many" side of the relationship.
The best I can do is a one-to-one-or-none or a one-to-one-at-most relationship. (I will admit to sloppy terminology myself — I just call it a one-to-one relationship.)
Whatever you decide to call it, you can model it properly and maintain integrity in your database by making the property_id column in property a PK and the property_id column in flat a PK and also an FK back to property.
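To illustrate the "one-to-one-at-most" behaviour, here is a sketch of the property/flat tables with property_id as the PK in both and as an FK in flat. It uses Python's built-in sqlite3 module; the column types, the sample rows and the try_insert helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs unless asked

conn.executescript("""
CREATE TABLE property (
    property_id        INTEGER PRIMARY KEY,
    property_location  TEXT,
    property_price     REAL
);
CREATE TABLE flat (
    property_id         INTEGER PRIMARY KEY REFERENCES property(property_id),
    flat_floor          INTEGER,
    flat_bedroom_count  INTEGER
);
INSERT INTO property VALUES (1, 'Main St 5', 250000), (2, 'Hill Rd 9', 400000);
INSERT INTO flat VALUES (1, 3, 2);
""")

def try_insert(sql):
    """Hypothetical helper: True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

second_flat_for_1 = try_insert("INSERT INTO flat VALUES (1, 4, 1)")       # PK allows at most one flat row
flat_without_property = try_insert("INSERT INTO flat VALUES (99, 1, 1)")  # FK requires an existing property
flat_for_2 = try_insert("INSERT INTO flat VALUES (2, 0, 3)")              # a property may have a flat row
```

The PK on flat gives the "at most one" side and the FK gives the "must exist in property" side; a property row with no flat row remains perfectly valid.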
"Logic and Databases" advances the term "at most one to at most one" for this kind of relationship. (Note that it is insane to assign names to tables on account of which relationships they participate in.)
Beware of the people who have suggested things like "foreign key", "table inheritance" and, in brief, all the other answers given here. Those people are making assumptions that you have not explicitly stated to be valid, namely that one of your two tables will be guaranteed to contain all key values that appear in the other.
(Dysfunctionality of the site prevents me from adding this as a comment in the proper place.)
"How would you interpret '...that share a common primary key?'"
I interpret that in the only reasonable sense possible: that within table1, attribute values for the primary key are guaranteed to be unique, and that within table2, attribute values for the primary key are guaranteed to be unique. And that furthermore, the primary key in both tables has the same [set of] attribute names, and that the types corresponding to the primary key attribute[s] are also pairwise the same. Nothing more and nothing less.
In particular, "sharing a primary key" means "having a primary key in common", and that means in turn "having a certain 'internal uniqueness rule' in common", but that commonality in no way guarantees that a primary key value appearing in one table must also appear in the second table.
"Can you give an example involving two tables with shared primary keys where one table wouldn't contain all the key values that appear in the other?"
Table1: column A of type integer, primary key A
Table2: column A of type integer, primary key A
Rows in table1: {A:1}. Satisfies the primary key for table1.
Rows in table2: {A:2}. Satisfies the primary key for table2.
Convinced?
"Foreign key"?

Is there an answer matrix I can use to decide if I need a foreign key or not?

For example, I have a table that stores classes, and a table that stores class_attributes. class_attributes has a class_attribute_id and a class_id, while classes has a class_id.
I'd guess that if a dataset is "solely a child of", "belongs solely to" or "is solely owned by" another one, then I need an FK to identify the parent. Without class_id in the class_attributes table I could never find out which class an attribute belongs to.
Maybe there's a helpful answer matrix for this?
Wikipedia is helpful.
In the context of relational databases, a foreign key is a referential constraint between two tables. The foreign key identifies a column or a set of columns in one (referencing) table that refers to a column or set of columns in another (referenced) table. The columns in the referencing table must be the primary key or other candidate key in the referenced table.
(and it goes on into more and more detail)
If you want to enforce the constraint that each row in class_attributes applies to exactly one row of classes, you need a foreign key. If you don't care about enforcing this (i.e., you're fine with having attributes for non-existent classes), you don't need an FK.
I don't have an answer matrix, but just for clarification purposes, we're talking about Database Normalization:
http://en.wikipedia.org/wiki/Database_normalization
And to a certain extent Denormalization:
http://en.wikipedia.org/wiki/Denormalization
I would say it's the other way around. First, you design what kinds of objects you need to have. For each of those you will create a table.
Part of this phase is designing the keys, that is, the combinations of attributes (columns) that uniquely identify the object. You may or may not add an artificial key or surrogate key for convenience or performance reasons. From these keys, you typically elect one canonical key, the primary key, which you try to use consistently to identify objects in that table. (You keep the other keys too; they serve to ensure uniqueness as a business rule, not so much for identification purposes.)
Then, you think what relationships exist between the objects. An object that is 'owned' by another object, or an object that refers to another object needs some way to identify its related object. In the corresponding table (child table) you add columns to make a foreign key to point to the primary key of the referenced table.
This takes care of all one to many relationships.
Sometimes, an object can be related multiple times to another object. For example, an order can be used to order multiple products, but a product can appear on multiple orders as well. For those relationships, you design a separate table (intersection table - in this example, order_items). This table will have a unique key created from two foreign keys: one pointing to the one parent (orders), one to the other parent (products). And again, you add the columns to the intersection table that you need to create those foreign keys.
So in short, you first design keys and foreign keys, only then you start adding columns to implement them.
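The orders/products example with its order_items intersection table might be sketched as follows, using Python's built-in sqlite3 module. The customer, name and quantity columns and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs unless asked

conn.executescript("""
CREATE TABLE orders (
    order_id  INTEGER PRIMARY KEY,
    customer  TEXT NOT NULL
);
CREATE TABLE products (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
-- Intersection table: its key is built from the two foreign keys.
CREATE TABLE order_items (
    order_id    INTEGER NOT NULL REFERENCES orders(order_id),
    product_id  INTEGER NOT NULL REFERENCES products(product_id),
    quantity    INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (order_id, product_id)
);
INSERT INTO orders VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO products VALUES (10, 'keyboard'), (20, 'mouse');
INSERT INTO order_items VALUES (1, 10, 1), (1, 20, 2), (2, 10, 1);
""")

# The same table answers both directions of the many-to-many relationship.
orders_with_keyboard = [r[0] for r in conn.execute(
    "SELECT order_id FROM order_items WHERE product_id = 10 ORDER BY order_id")]
products_on_order_1 = [r[0] for r in conn.execute(
    "SELECT product_id FROM order_items WHERE order_id = 1 ORDER BY product_id")]
```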
Don't be concerned with the type of relationship -- it has more to do with the cardinality of the relationship.
If you have a one-to-many relationship, then you'd want to assign a primary key to the table on the "one" side (usually the smaller of the two) and store it as a foreign key in the table on the "many" side.
You'd also do it with one-to-one relationships, but some people argue that you should avoid them.
In the case of a many-to-many relationship, you'd want to make a join table that holds a foreign key to each of the original tables.

Resources