Data structure using master id - sql-server

I have a database with tables A and B in a one-to-many relationship. So one entity in A can be assigned to multiple and differing entities in B. A and B each have their own specific fields, but there are also fields and workflows related to either A or B, which are basically the same data but related only to either A or B.
As an example, an entity in A can have multiple comments for differing reasons and so can entities in B. Since there can be multiple comments for a single record I have to have a related comment table outside of tables A and B. I didn't want to have two comment tables, one for A and a separate table for B, so I set up a MasterID table that is related to both A and B and has referential integrity enforced. This means that when I want to add a record in A or B, I have to make sure that a MasterID already exists in the MasterID table. There are other tables that have the same type of functionality, comments is just one example, but if I didn't use a MasterID I'd have to create multiple tables each for A and B.
So my question is, is this the correct way to do this? Is there another way? The front-end will be in Access so I'm running into a little bit of trouble making sure a MasterID is created right before creating a new record in A or B.
MasterID(MasterID)
TableA(TableAID, FK_MasterID)
TableB(TableBID, FK_MasterID, FK_TableAID)
Comments(CommentID, MasterID, Comment)
Thanks for any help.

From a pure data design standpoint, you are on the right track, but not quite. You can use an entity-subtyping approach in which A and B are subtypes of another entity (MasterID). It is this supertype entity which attracts comments. However, for this to be true subtyping, the PK of A and the PK of B would be the FK to MasterID.
The way you've designed your tables, they have two candidate keys. If you eliminate the redundant candidate keys, then you have a standard entity-subtyping pattern, which is a legitimate and commonly used design approach.
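As a rough sketch of the subtyping pattern described above: the snippet below uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the DDL is only an approximation of what you would write there. The helper name `new_master_id` is made up for illustration. The point it shows is that the PK of each subtype table is also its FK to MasterID, and Comments hangs off the supertype.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; this is an assumption of the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

conn.executescript("""
CREATE TABLE MasterID (
    MasterID INTEGER PRIMARY KEY
);
-- The primary key of each subtype is also its foreign key to the supertype.
CREATE TABLE TableA (
    MasterID INTEGER PRIMARY KEY REFERENCES MasterID (MasterID)
);
CREATE TABLE TableB (
    MasterID    INTEGER PRIMARY KEY REFERENCES MasterID (MasterID),
    FK_TableAID INTEGER NOT NULL REFERENCES TableA (MasterID)
);
-- Comments attach to the supertype, so one table serves both A and B.
CREATE TABLE Comments (
    CommentID INTEGER PRIMARY KEY,
    MasterID  INTEGER NOT NULL REFERENCES MasterID (MasterID),
    Comment   TEXT NOT NULL
);
""")


def new_master_id(conn):
    """Hypothetical helper: create the supertype row and return its key."""
    return conn.execute("INSERT INTO MasterID DEFAULT VALUES").lastrowid


# The supertype row is created first, then the subtype row reuses its key.
a_id = new_master_id(conn)
conn.execute("INSERT INTO TableA (MasterID) VALUES (?)", (a_id,))
b_id = new_master_id(conn)
conn.execute(
    "INSERT INTO TableB (MasterID, FK_TableAID) VALUES (?, ?)", (b_id, a_id)
)

conn.executemany(
    "INSERT INTO Comments (MasterID, Comment) VALUES (?, ?)",
    [(a_id, "comment on an A row"), (b_id, "comment on a B row")],
)
comment_count = conn.execute("SELECT COUNT(*) FROM Comments").fetchone()[0]
```

With this layout the separate TableAID and TableBID columns from the question disappear, which is what eliminating the redundant candidate keys amounts to.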

Based on my understanding of the problem, I think this adds too much complexity for too little value. If I understand you correctly, you have a situation like the one in the picture, and you want to make the key for Comment unique.
Creating a fourth table could work but it adds unnecessary complexity.
What you could do instead is make the key for the Comments table a compound key of two columns: one is a sequence number and the other is a character field indicating the parent table. So you get keys like (A,1), (A,2), (B,3), (A,4), (B,5), and so on.
This way you don't need the master table, and you don't need FKs in Table A or B.

Related

Normalization (3NF) on a simple table of two columns

I have a table with only two attributes (DeliveryPerson and DeliveryTime). Each person can deliver a “product” at a specific delivery time. As you can see below, John, for example, has delivered three products at different delivery times.
According to my task, I must put this table in 3NF, but I am confused because I cannot set “deliveryPerson” as a primary key because there are repeated values in this column. Is there any way of setting up this table to satisfy 3NF? If that is not possible, is it correct to have a table like this in a DB without a Primary key?
Thank you very much!
Normalisation is not about adding Primary Keys to a table where you've already decided the columns, it's about deciding what tables and columns you need in the first place. The inability to define a Primary Key on this table is the problem you've been asked to solve; the solution will involve creating new tables.
Rather than looking at the table, look at the data you're trying to model:
There are four (and probably any number of) delivery people
Each delivery person can have one or more (maybe even zero) delivery times
A normalised database will represent each of those separately. I'll leave the details for you to work out, rather than feeding you the full answer.
There are plenty of tutorials available which will probably explain it better than me.

Separate tables for 1-1 relationship

I'm creating an Access database to hold student internship information. The issue I'm having is I have three tables that have a one and only one relationship with the internship table (Assignment, Supervisor Evaluation, and Student Evaluation).
Since Access doesn't allow a table to have more than one auto-generated number, I can't let the internship table create the ID number for each of the three tables. So I'm not sure how to set things up so that, when we enter data into these tables' forms, I can assign it specifically to an internship. Any advice?
1-1 relationships always smell like they should be merged into one table. This is particularly so if they are actually 1-1 and shouldn't be 1-0,1. In the latter case, if the dependent information can be missing and will be missing in a majority of cases, it might be helpful to separate it away into a table of its own. But even this can be expressed by giving null values to certain attributes.
Now if, for some reason, you insist on those 4 tables, there are two ways to go for the primary keys. One is, for the dependent tables, not to declare the primary key as auto-generated, but just as a number, and to assign to it the auto-generated value of the Internship record. Another is to auto-generate a primary key for each of the dependent tables, and have a foreign key in the Internship table for each of them. As I consider the entire construct of those dependent tables as unnecessarily complicated, I can't give a recommendation on which of these ways to prefer.
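The first of those two ways might look roughly like the sketch below. It uses Python's sqlite3 module rather than Access, and the column names (`student`, `description`) are invented for illustration. Only Internship generates its own number; the dependent table's primary key is a plain number that is always filled in with the Internship id.

```python
import sqlite3

# SQLite stands in for Access here; an assumption of this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Internship (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,  -- the only generated key
    student TEXT NOT NULL
);
-- The dependent table reuses the Internship id as its own primary key,
-- so each internship can have at most one Assignment row.
CREATE TABLE Assignment (
    internship_id INTEGER PRIMARY KEY REFERENCES Internship (id),
    description   TEXT
);
""")

internship_id = conn.execute(
    "INSERT INTO Internship (student) VALUES (?)", ("Example Student",)
).lastrowid

# The key is always supplied explicitly, never generated by Assignment.
conn.execute(
    "INSERT INTO Assignment (internship_id, description) VALUES (?, ?)",
    (internship_id, "Example assignment"),
)


def add_second_assignment():
    """Try to attach another Assignment to the same internship."""
    try:
        conn.execute(
            "INSERT INTO Assignment (internship_id, description) VALUES (?, ?)",
            (internship_id, "duplicate"),
        )
    except sqlite3.IntegrityError:
        return False  # rejected: the primary key already exists
    return True


second_assignment_accepted = add_second_assignment()
```

In a form, this corresponds to carrying the current Internship id into the dependent record instead of letting the dependent table number itself.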
There is another concern I have about your data model. Your tables have those attributes like answer1, answer2, ... Now if you have a small fixed amount of those attributes, this might be okay. But could you have a larger set of fixed questions, maybe for each type of internship, that might vary dynamically and can't just be expressed by a fixed column structure? In that case you would need something like
Question(id, text)
Internship(id, ...)
Answer(id, internship_id, question_id, student_answer, supervisor_evaluation)
So your cardinalities would be
Internship 1-----0,n Answer 0,n------1 Question
Same for the other details of the internship.
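A minimal sketch of the Question / Internship / Answer layout above, again using Python's sqlite3 module as a stand-in. The UNIQUE constraint (one answer per question per internship) and the sample rows are assumptions added for the example, not something stated above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Question (
    id   INTEGER PRIMARY KEY,
    text TEXT NOT NULL
);
CREATE TABLE Internship (
    id INTEGER PRIMARY KEY
);
CREATE TABLE Answer (
    id                    INTEGER PRIMARY KEY,
    internship_id         INTEGER NOT NULL REFERENCES Internship (id),
    question_id           INTEGER NOT NULL REFERENCES Question (id),
    student_answer        TEXT,
    supervisor_evaluation TEXT,
    -- Assumption: at most one answer per question per internship.
    UNIQUE (internship_id, question_id)
);
""")

conn.executemany(
    "INSERT INTO Question (id, text) VALUES (?, ?)",
    [(1, "Example question one"), (2, "Example question two")],
)
conn.execute("INSERT INTO Internship (id) VALUES (1)")
conn.executemany(
    "INSERT INTO Answer (internship_id, question_id, student_answer) "
    "VALUES (?, ?, ?)",
    [(1, 1, "first answer"), (1, 2, "second answer")],
)

# New questions become new rows, not new columns such as answer1, answer2.
rows = conn.execute(
    """
    SELECT q.text, a.student_answer
    FROM Answer AS a
    JOIN Question AS q ON q.id = a.question_id
    WHERE a.internship_id = ?
    ORDER BY q.id
    """,
    (1,),
).fetchall()
```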

What's the best practice for a table that refers n tables with a match in one of them?

I'm working on a database design, and I face a situation where notifications will be sent according to logs in three tables, each log contains different data. NOTIFICATIONS table should then refer these three tables, and I thought of three possible designs, each seems to have flaws in it:
1. Each log table will have a unique incremented id, and NOTIFICATIONS table will have three different columns as FK's. The main flaw in this design is that I can't create real FK's since two of the three fields will be NULL for each row, and the query will have to "figure out" what kind of data is actually logged in this row.
2. The log tables will have one unique incremented id for all of them. Then I can make three OUTER JOINS with these tables when I query NOTIFICATIONS, and each row will have exactly one match. This seems at first like a better design, but I will have gaps in each log table and the flaws in option 1 still exist.
3. Option 1/2 + creating three notifications tables instead of one. This option will require the app to query notifications using UNION ALL.
Which option makes a better practice? Is there another way I didn't think of? Any advice will be appreciated.
I have one solution that sacrifices the referential integrity to help you achieve what you want.
You can keep a GUID data type as the primary key in all three log tables. In the Notification table you just need to add one foreign key column which won't point to any particular table. So only you know that it is a foreign key, SQL Server doesn't and it doesn't enforce referential integrity. In this column you store the GUID of notification. The notification can be in any of the three logs but since the primary key of all three logs is GUID, you can store the key in your Notification table.
Also you add another column in the Notification table to tell which of the three logs this GUID belongs to. Now you can uniquely know which row in the required log table you have to go to in order to find this notification info.
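Here is a small sketch of that idea, written with Python's sqlite3 and uuid modules instead of SQL Server. The log table names and the `log_type` values are invented for the example. Note that nothing in the database ties `log_id` to a log row, which is exactly the referential integrity being given up.

```python
import sqlite3
import uuid

# Hypothetical log tables, keyed by the value stored in Notification.log_type.
LOG_TABLES = {"login": "LoginLog", "error": "ErrorLog", "payment": "PaymentLog"}

conn = sqlite3.connect(":memory:")
for table in LOG_TABLES.values():
    conn.execute(f"CREATE TABLE {table} (id TEXT PRIMARY KEY, detail TEXT)")
conn.execute("""
CREATE TABLE Notification (
    id       INTEGER PRIMARY KEY,
    log_id   TEXT NOT NULL,  -- a foreign key by convention only
    log_type TEXT NOT NULL CHECK (log_type IN ('login', 'error', 'payment'))
)
""")


def log_and_notify(log_type, detail):
    """Write a log row with a fresh GUID, then a notification pointing at it."""
    log_id = str(uuid.uuid4())
    conn.execute(
        f"INSERT INTO {LOG_TABLES[log_type]} (id, detail) VALUES (?, ?)",
        (log_id, detail),
    )
    return conn.execute(
        "INSERT INTO Notification (log_id, log_type) VALUES (?, ?)",
        (log_id, log_type),
    ).lastrowid


def log_detail_for(notification_id):
    """Use log_type to pick the table, then look the GUID up in it."""
    log_id, log_type = conn.execute(
        "SELECT log_id, log_type FROM Notification WHERE id = ?",
        (notification_id,),
    ).fetchone()
    row = conn.execute(
        f"SELECT detail FROM {LOG_TABLES[log_type]} WHERE id = ?", (log_id,)
    ).fetchone()
    return row[0] if row else None  # None means a dangling reference


n1 = log_and_notify("error", "disk full")
n2 = log_and_notify("login", "user signed in")

# Nothing stops a notification from pointing at a log row that does not exist.
dangling = conn.execute(
    "INSERT INTO Notification (log_id, log_type) VALUES (?, 'error')",
    (str(uuid.uuid4()),),
).lastrowid
```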
The problem is that you have three separate log tables. Instead, you should have had only one log table with an extra column specifying what kind of logging it is. That way you'd have only one table: referential integrity would have been preserved and the design would have been simple.
Use one table holding notification ids. Each of the three original tables holds subtypes of notification ids, with FKs on their own ids to that table. Search for subtyping/subtables in databases. This is a standard design pattern/idiom.
(There are entities. We group them conceptually. We call the groups kinds or types. We say of a particular entity that it is a whatever kind or type of entity, or even that it "is a" whatever. We can have groups that contain all the entities of another group. Since the larger is a superset of the smaller we say that the larger type is a supertype of the smaller type, and the smaller is a subtype of the larger.)
There are idioms you can use to help constrain your tables declaratively. The main one is to have a subtype tag in the supertype table, and even also in the subtype tables (where each table has only one tag value).
I eventually faced two main options:
Following the last suggestion in this answer.
Choosing a less normalized structure for the database, AKA fake/no FK's. To be precise, in my case it would be my second option above with fake FK's.
I chose option #2, as a DBA whom I consulted enlightened me on the idea that database normalization should be done according to possible structure breakage. In my case, although notifications are created based on logs, these FK's are not necessary for querying the notifications or the logs, and the app does not have to ensure this relationship in order to function properly. Thus, following option #1 may be "over-normalization".
Thanks all for your answers and comments.

Database Mapping - Multiple Foreign Keys

I want to make sure this is the best way to handle a certain scenario.
Let's say I have three main tables; I will keep them generic. They all have primary keys, and they are all independent tables referencing nothing.
Table 1
PK
VarChar Data
Table 2
PK
VarChar Data
Table 3
PK
VarChar Data
Here is the scenario, I want a user to be able to comment on specific rows on each of the above tables. But I don't want to create a bunch of comment tables. So as of right now I handled it like so..
There is a comment table that has three foreign key columns each one references the main tables above. There is a constraint that only one of these columns can be valued.
CommentTable
PK
FK to Table1
FK to Table2
FK to Table3
VarChar Comment
FK to Users
My question: is this the best way to handle the situation? Does a generic foreign key exist? Or should I have a separate comments table for each main table.. even though the data structure would be exactly the same? Or would a mapping table for each one be a better solution?
My question: is this the best way to handle the situation?
Multiple FKs with a CHECK that allows only one of them to be non-NULL is a reasonable approach, especially for relatively few tables like in this case.
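For illustration, here is one way the "exactly one FK is non-NULL" rule could be declared. The sketch runs on Python's sqlite3 module rather than any particular server DBMS, the column names `Table1ID`, `Table2ID` and `Table3ID` are invented, and the FK to Users is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, Data TEXT);
CREATE TABLE Table2 (ID INTEGER PRIMARY KEY, Data TEXT);
CREATE TABLE Table3 (ID INTEGER PRIMARY KEY, Data TEXT);

CREATE TABLE CommentTable (
    ID       INTEGER PRIMARY KEY,
    Table1ID INTEGER REFERENCES Table1 (ID),
    Table2ID INTEGER REFERENCES Table2 (ID),
    Table3ID INTEGER REFERENCES Table3 (ID),
    Comment  TEXT NOT NULL,
    -- Count the non-NULL parents; exactly one is allowed.
    CHECK (
        (CASE WHEN Table1ID IS NULL THEN 0 ELSE 1 END)
      + (CASE WHEN Table2ID IS NULL THEN 0 ELSE 1 END)
      + (CASE WHEN Table3ID IS NULL THEN 0 ELSE 1 END) = 1
    )
);

INSERT INTO Table1 (ID, Data) VALUES (1, 'row in Table1');
INSERT INTO Table2 (ID, Data) VALUES (1, 'row in Table2');
""")


def try_comment(t1=None, t2=None, t3=None):
    """Return True if the database accepts the comment, False otherwise."""
    try:
        conn.execute(
            "INSERT INTO CommentTable (Table1ID, Table2ID, Table3ID, Comment) "
            "VALUES (?, ?, ?, ?)",
            (t1, t2, t3, "example comment"),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "one parent": try_comment(t1=1),
    "two parents": try_comment(t1=1, t2=1),   # violates the CHECK
    "no parent": try_comment(),               # violates the CHECK
    "missing parent row": try_comment(t3=42), # violates the FK
}
```

Both rules are declarative here: the CHECK handles "only one column valued" and the ordinary FKs handle "the referenced row must exist".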
The alternate approach would be to "inherit" the Table 1, 2 and 3 from a common "parent" table, then connect the comments to the parent.
Look here and here for more info.
Does a generic foreign key exist?
If you mean a FK that can "jump" from table to table, then no.
Assuming all 3 FKs are of the same type [1], you could theoretically implement something similar by keeping both the foreign key value and the referenced table name [2] and then enforcing it through a trigger, but declarative constraints should be preferred over that, even at a price of slightly more storage space.
If your DBMS fully supports "virtual" or "calculated" columns, then you could do something similar to above, but instead of having a trigger, generate 3 calculated columns based on FK value and table name. Only one of these calculated columns would be non-NULL at any given time and you could use "normal" FKs for them as you would for the physical columns.
But, all that would make sense when there are many "connectable" tables and your DBMS is not thrifty in storing NULLs. There is very little to gain when there are just 3 of them or even when there are many more than that but your DBMS spends only one bit on each NULL field.
Or should I have a separate comments table for each main table, even though the data structure would be exactly the same?
The "data structure" is not the only thing that matters. If you happen to have different constraints (e.g. a FK that applies to one of them but not the other), that would warrant separate tables even though the columns are the same.
But, I'm guessing this is not the case here.
Or would a mapping table for each one be a better solution?
I'm not exactly sure what you mean by "mapping table", but you could do something like this:
Unfortunately, that would allow a single comment to be connected to more than one table (or no table at all), and is in itself a complication over what you already have.
All said and done, your original solution is probably fine.
[1] Or you are willing to store it as string and live with conversions, which you should be reluctant to do.
[2] In practice, this would not really be a name (as in string) - it would be an integer (or enum if DBMS supports it) with one of the well-known predefined values identifying the table.
Thanks for all the help, folks. I was able to formulate a solution with the help of a colleague of mine. Instead of multiple mapping tables, I decided to just use one.
This mapping table holds a group of comments, so it has no primary key, and each group row links back to a comment. You can have multiple rows with the same group id, so the relationship would be one-many-one.

Foreign Key Referencing Multiple Tables

I have a column with a uniqueidentifier that can potentially reference one of four different tables. I have seen this done in two ways, but both seem like bad practice.
First, I've seen a single ObjectID column without explicitly declaring it as a foreign key to a specific table. Then you can just shove any uniqueidentifier you want in it. This means you could potentially insert IDs from tables that are not part of the 4 tables I wanted.
Second, because the data can come from four different tables, I've also seen people make 4 different foreign keys. And in doing so, the system relies on ONE AND ONLY ONE column having a non-NULL value.
What's a better approach to doing this? For example, records in my table could potentially reference Hospitals(ID), Clinics(ID), Schools(ID), or Universities(ID)... but ONLY those tables.
Thanks!
You might want to consider a Type/SubType data model. This is very much like class/subclasses in object oriented programming, but much more awkward to implement, and no RDBMS (that I am aware of) natively supports them. The general idea is:
You define a Type (Building), create a table for it, give it a primary key
You define two or more sub-types (here, Hospital, Clinic, School, University), create tables for each of them, make primary keys… but the primary keys are also foreign keys that reference the Building table
Your table with one “ObjectType” column can now be built with a foreign key onto the Building table. You’d have to join a few tables to determine what kind of building it is, but you’d have to do that anyway. That, or store redundant data.
You have noticed the problem with this model, right? What’s to keep a Building from having entries in two or more of the subtype tables? Glad you asked:
Add a column, perhaps “BuildingType”, to Building, say char(1) with allowed values of {H, C, S, U} indicating (duh) type of building.
Build a unique constraint on BuildingID + BuildingType
Have the BuildingType column in the subtables. Put a check constraint on it so that it can only ever be set to that table's value (H for the Hospitals table, etc.). In theory, this could be a computed column; in practice, this won't work because of the following step:
Build the foreign key to relate the tables using both columns
Voila: Given a BUILDING row set with type H, an entry in the SCHOOL table (with type S) cannot be set to reference that Building
You will recall that I did say it was hard to implement.
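The steps above can be sketched as follows. This is only an approximation, run on Python's sqlite3 module rather than a server RDBMS, and it shows just two of the four subtype tables to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Building (
    BuildingID   INTEGER PRIMARY KEY,
    BuildingType TEXT NOT NULL CHECK (BuildingType IN ('H', 'C', 'S', 'U')),
    -- The unique constraint on both columns is what the
    -- two-column foreign keys below point at.
    UNIQUE (BuildingID, BuildingType)
);
CREATE TABLE Hospital (
    BuildingID   INTEGER PRIMARY KEY,
    BuildingType TEXT NOT NULL DEFAULT 'H' CHECK (BuildingType = 'H'),
    FOREIGN KEY (BuildingID, BuildingType)
        REFERENCES Building (BuildingID, BuildingType)
);
CREATE TABLE School (
    BuildingID   INTEGER PRIMARY KEY,
    BuildingType TEXT NOT NULL DEFAULT 'S' CHECK (BuildingType = 'S'),
    FOREIGN KEY (BuildingID, BuildingType)
        REFERENCES Building (BuildingID, BuildingType)
);

-- Building 1 is a hospital, with its matching subtype row.
INSERT INTO Building (BuildingID, BuildingType) VALUES (1, 'H');
INSERT INTO Hospital (BuildingID) VALUES (1);
""")


def school_insert_allowed(building_id):
    """Try to add a School row for the given building."""
    try:
        conn.execute(
            "INSERT INTO School (BuildingID) VALUES (?)", (building_id,)
        )
        return True
    except sqlite3.IntegrityError:
        return False


# The School row would need (1, 'S') in Building, but only (1, 'H') exists.
school_for_hospital_building = school_insert_allowed(1)
```

The type column in each subtable carries no new information; it exists only so the two-column foreign key can reject a mismatched subtype row.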
In fact, the big question is: Is this worth doing? If it makes sense to implement the four (or more, as time passes) building types as type/subtype (further normalization advantages: one place for address and other attributes common to every building, with building-specific attributes stored in the subtables), it may well be worth the extra effort to build and maintain. If not, then you’re back to square one: a logical model that is hard to implement in the average modern-day RDBMS.
Let's start at the conceptual level. If we think of Hospitals, Clinics, Schools, and Universities as classes of subject matter entities, is there a superclass that generalizes all of them? There probably is. I'm not going to try to tell you what it is, because I don't understand your subject matter as well as you do. But I'm going to proceed as if we can call all of them "Institutions", and treat each of the four as subclasses of Institutions.
As other responders have noted, class/subclass extension and inheritance are not built into most relational database systems. But there is plenty of assistance, if you know the right buzzwords. What follows is intended to teach you the buzzwords, in database lingo. Here is a summary of the buzzwords coming: "ER Generalization", "ER Specialization", "Single Table Inheritance", "Class Table Inheritance", "Shared Primary Key".
Staying at the conceptual level, ER modeling is a good way of understanding the data at a conceptual level. In ER modeling, there is a concept, "ER Generalization", and a counterpart concept "ER Specialization" that parallel the thought process I just presented above as "superclass/subclass". ER Specialization tells you how to diagram subclasses, but it doesn't tell you how to implement them.
Next we move down from the conceptual level to the logical level. We express the data in terms of relations or, if you will, SQL tables. There are a couple of techniques for implementing subclasses. One is called "Single Table Inheritance". The other is called "Class Table Inheritance". In connection with Class Table Inheritance, there is another technique that goes by the name "Shared Primary Key".
Going forward in your case with class table inheritance, we first design a table called "Institutions", with an Id field, a name field, and all of the fields that pertain to institutions, no matter which of the four kinds they are. Things like mailing address fields, for instance. Again, you understand your data better than I do, and you can find fields that are in all four of your existing tables. We populate the id field in the usual way.
Next we design four tables called "Hospitals", "Clinics", "Schools", and "Universities". These will contain an id field, plus all of the data fields that pertain only to that kind of institution. For instance, a hospital might have a "bed capacity". Again, you understand your data better than I do, and you can figure these out from the fields in your existing tables that didn't make it into the Institutions table.
This is where "shared primary key" comes in. When a new entry is made into "Institutions", we have to make a new parallel entry into one of four specialized subclass tables. But we don't use some sort of autonumber feature to populate the id field. Instead, we put a copy of the id field from the "Institutions" table into the id field of the subclass table.
This is a little work, but the benefits are well worth the effort. Shared primary key enforces the one-to-one nature of the relationship between subclass entries and superclass entries. It makes joining superclass data and subclass data simple, easy, and fast. It eliminates the need for a special field to tell you which subclass a given institution belongs in.
And, in your case, it provides a handy answer to your original question. The foreign key you were originally asking about is now always a foreign key to the Institutions table. And, because of the magic of shared-primary-key, the foreign key also references the entry in the appropriate subclass table, with no extra work.
You can create four views that combine institution data with each of the four subclass tables, for convenience.
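A compact sketch of class table inheritance with a shared primary key and one convenience view, using Python's sqlite3 module as a stand-in. The `Records` table, the helper functions and the sample names are hypothetical; `bed_capacity` comes from the example above. Only two of the four subclass tables are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Institutions (
    id   INTEGER PRIMARY KEY,  -- generated in the usual way
    name TEXT NOT NULL
);
-- Subclass tables copy the Institutions id instead of generating their own.
CREATE TABLE Hospitals (
    id           INTEGER PRIMARY KEY REFERENCES Institutions (id),
    bed_capacity INTEGER
);
CREATE TABLE Schools (
    id INTEGER PRIMARY KEY REFERENCES Institutions (id)
);
-- Hypothetical referencing table: its FK always points at Institutions.
CREATE TABLE Records (
    id             INTEGER PRIMARY KEY,
    institution_id INTEGER NOT NULL REFERENCES Institutions (id)
);
-- One convenience view per subclass; only the hospital one is shown.
CREATE VIEW HospitalDetails AS
    SELECT i.id, i.name, h.bed_capacity
    FROM Institutions AS i
    JOIN Hospitals AS h ON h.id = i.id;
""")


def add_hospital(name, bed_capacity):
    """Insert the superclass row, then the subclass row with the same id."""
    new_id = conn.execute(
        "INSERT INTO Institutions (name) VALUES (?)", (name,)
    ).lastrowid
    conn.execute(
        "INSERT INTO Hospitals (id, bed_capacity) VALUES (?, ?)",
        (new_id, bed_capacity),
    )
    return new_id


def add_school(name):
    """Same pattern for a school, which has no extra fields in this sketch."""
    new_id = conn.execute(
        "INSERT INTO Institutions (name) VALUES (?)", (name,)
    ).lastrowid
    conn.execute("INSERT INTO Schools (id) VALUES (?)", (new_id,))
    return new_id


hospital_id = add_hospital("Example Hospital", 120)
school_id = add_school("Example School")

# Records can reference either kind of institution through the same FK.
conn.executemany(
    "INSERT INTO Records (institution_id) VALUES (?)",
    [(hospital_id,), (school_id,)],
)
hospital_rows = conn.execute(
    "SELECT id, name, bed_capacity FROM HospitalDetails"
).fetchall()
```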
Look up "ER Specialization", "Class Table Inheritance", "Shared Primary Key", and maybe "Single Table Inheritance" on the web, and here in SO. There are tags for most of these concepts or techniques here in SO.
You could put a trigger on the table and enforce the referential integrity there. I don't think there's a really good out-of-the-box feature to implement this requirement.
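A rough sketch of the trigger idea, written for SQLite through Python's sqlite3 module; trigger syntax differs in SQL Server, and the `Records` table and trigger name are invented for the example. It only checks inserts. Updates to `ObjectID` and deletes from the four parent tables would need further triggers.

```python
import sqlite3
import uuid

PARENT_TABLES = ("Hospitals", "Clinics", "Schools", "Universities")

conn = sqlite3.connect(":memory:")
for table in PARENT_TABLES:
    conn.execute(f"CREATE TABLE {table} (ID TEXT PRIMARY KEY)")

conn.executescript("""
CREATE TABLE Records (
    RecordID INTEGER PRIMARY KEY,
    ObjectID TEXT NOT NULL  -- no declared FK; the trigger checks it instead
);

-- Reject an insert whose ObjectID is in none of the four parent tables.
CREATE TRIGGER Records_ObjectID_must_exist
BEFORE INSERT ON Records
FOR EACH ROW
WHEN NOT EXISTS (SELECT 1 FROM Hospitals    WHERE ID = NEW.ObjectID)
 AND NOT EXISTS (SELECT 1 FROM Clinics      WHERE ID = NEW.ObjectID)
 AND NOT EXISTS (SELECT 1 FROM Schools      WHERE ID = NEW.ObjectID)
 AND NOT EXISTS (SELECT 1 FROM Universities WHERE ID = NEW.ObjectID)
BEGIN
    SELECT RAISE(ABORT, 'ObjectID is not in any of the four tables');
END;
""")

clinic_id = str(uuid.uuid4())
conn.execute("INSERT INTO Clinics (ID) VALUES (?)", (clinic_id,))


def record_insert_allowed(object_id):
    """Return True if the trigger lets the row through."""
    try:
        conn.execute("INSERT INTO Records (ObjectID) VALUES (?)", (object_id,))
        return True
    except sqlite3.DatabaseError:
        return False


known_ok = record_insert_allowed(clinic_id)
unknown_ok = record_insert_allowed(str(uuid.uuid4()))
```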
