I have a table with only two attributes (DeliveryPerson and DeliveryTime). Each person can deliver a “product” at a specific Delivery Time. As you can see below John for example has delivered three products at different delivery times.
According to my task, I must put this table in 3NF, but I am confused because I cannot set “DeliveryPerson” as a primary key, since that column contains repeated values. Is there any way of setting up this table to satisfy 3NF? If that is not possible, is it correct to have a table like this in a DB without a primary key?
Thank you very much!
Normalisation is not about adding Primary Keys to a table where you've already decided the columns, it's about deciding what tables and columns you need in the first place. The inability to define a Primary Key on this table is the problem you've been asked to solve; the solution will involve creating new tables.
Rather than looking at the table, look at the data you're trying to model:
There are four (and probably any number of) delivery people
Each delivery person can have one or more (maybe even zero) delivery times
A normalised database will represent each of those separately. I'll leave the details for you to work out, rather than feeding you the full answer.
There are plenty of tutorials available which will probably explain it better than me.
I'm creating an Access database to hold student internship information. The issue I'm having is that three tables (Assignment, Supervisor Evaluation, and Student Evaluation) each have a one-to-one relationship with the internship table.
Since Access doesn't allow a table to have more than one auto-generated number, I can't let the internship table create the ID number for each of the three tables. So I'm not sure how to make it so that when we enter data through these tables' forms, I can assign it to a specific internship. Any advice?
1-1 relationships always smell like they should be merged into one table. This is particularly so if they are actually 1-1 and shouldn't be 1-0,1. In the latter case, if the dependent information can be missing and will be missing in a majority of cases, it might be helpful to separate it away into a table of its own. But even this can be expressed by giving null values to certain attributes.
Now if, for some reason, you insist on those four tables, there are two ways to go for the primary keys. One is, for the dependent tables, not to declare the primary key as auto-generated, but just as a number, and to assign to it the auto-generated value of the Internship record. Another is to auto-generate a primary key for each of the dependent tables, and have a foreign key in the Internship table for each of them. As I consider the entire construct of those dependent tables unnecessarily complicated, I can't give a recommendation on which of these ways to prefer.
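To make the first of those two ways concrete, here is a minimal sketch. It uses Python's sqlite3 rather than Access, only so that it runs anywhere; the column names (student, description) are made up for illustration. In Access terms, the parent key would be an AutoNumber and the dependent key a plain Number (Long Integer) field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE internship (
        id      INTEGER PRIMARY KEY AUTOINCREMENT,
        student TEXT NOT NULL                      -- hypothetical column
    );
    -- Dependent table: its key is a plain number, not auto-generated.
    -- Being both PRIMARY KEY and FOREIGN KEY allows at most one row
    -- per internship.
    CREATE TABLE assignment (
        internship_id INTEGER PRIMARY KEY REFERENCES internship (id),
        description   TEXT NOT NULL                -- hypothetical column
    );
""")

def try_insert(sql, params=()):
    """Run one INSERT; return False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# The internship row gets its id from the database ...
internship_id = conn.execute(
    "INSERT INTO internship (student) VALUES (?)", ("Alice",)
).lastrowid

# ... and the dependent row simply reuses that value as its own key.
insert_assignment = "INSERT INTO assignment (internship_id, description) VALUES (?, ?)"
first_ok = try_insert(insert_assignment, (internship_id, "Lab rotation"))
second_ok = try_insert(insert_assignment, (internship_id, "Another one"))
orphan_ok = try_insert(insert_assignment, (internship_id + 1000, "No parent"))
```

A second assignment for the same internship, or one pointing at an internship that does not exist, is rejected by the database itself.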
There is another concern I have about your data model. Your tables have those attributes like answer1, answer2, ... Now if you have a small fixed amount of those attributes, this might be okay. But could you have a larger set of fixed questions, maybe for each type of internship, that might vary dynamically and can't just be expressed by a fixed column structure? In that case you would need something like
Question(id, text)
Internship(id, ...)
Answer(id, internship_id, question_id, student_answer, supervisor_evaluation)
So your cardinalities would be
Internship 1-----0,n Answer 0,n------1 Question
Same for the other details of the internship.
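As a rough illustration of that structure, here is a sketch using Python's sqlite3. The table and column names follow the outline above; the student column, the sample rows, and the UNIQUE constraint (one answer per question per internship) are my own assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE question (
        id   INTEGER PRIMARY KEY,
        text TEXT NOT NULL
    );
    CREATE TABLE internship (
        id      INTEGER PRIMARY KEY,
        student TEXT NOT NULL            -- hypothetical column
    );
    CREATE TABLE answer (
        id                    INTEGER PRIMARY KEY,
        internship_id         INTEGER NOT NULL REFERENCES internship (id),
        question_id           INTEGER NOT NULL REFERENCES question (id),
        student_answer        TEXT,
        supervisor_evaluation TEXT,
        UNIQUE (internship_id, question_id)  -- assumption: one answer per pair
    );
""")

# Questions are rows, so adding one needs no schema change.
conn.executemany(
    "INSERT INTO question (id, text) VALUES (?, ?)",
    [(1, "What did you learn?"), (2, "Would you recommend it?")],
)
conn.execute("INSERT INTO internship (id, student) VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO answer (internship_id, question_id, student_answer) VALUES (?, ?, ?)",
    [(1, 1, "SQL"), (1, 2, "Yes")],
)

# All answers for one internship, with the question text attached.
rows = conn.execute(
    """
    SELECT q.text, a.student_answer
    FROM answer AS a
    JOIN question AS q ON q.id = a.question_id
    WHERE a.internship_id = ?
    ORDER BY q.id
    """,
    (1,),
).fetchall()
```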
I'm working on a database design, and I face a situation where notifications will be sent according to logs in three tables, each log contains different data. NOTIFICATIONS table should then refer these three tables, and I thought of three possible designs, each seems to have flaws in it:
1. Each log table will have a unique incremented id, and NOTIFICATIONS table will have three different columns as FK's. The main flaw in this design is that I can't create real FK's since two of the three fields will be NULL for each row, and the query will have to "figure out" what kind of data is actually logged in this row.
2. The log tables will share one unique incremented id across all of them. Then I can make three OUTER JOINS with these tables when I query NOTIFICATIONS, and each row will have exactly one match. This seems at first like a better design, but I will have gaps in each log table and the flaws in option 1 still exist.
3. Option 1 or 2, plus creating three notifications tables instead of one. This option will require the app to query notifications using UNION ALL.
Which option makes a better practice? Is there another way I didn't think of? Any advice will be appreciated.
I have one solution that sacrifices the referential integrity to help you achieve what you want.
You can keep a GUID data type as the primary key in all three log tables. In the Notification table you just need to add one foreign key column which won't point to any particular table. Only you know that it is a foreign key; SQL Server doesn't, so it won't enforce referential integrity. In this column you store the GUID of the log row the notification was created from. That row can be in any of the three logs, but since the primary key of all three logs is a GUID, you can store the key in your Notification table.
Also you add another column in the Notification table to tell which of the three logs this GUID belongs to. Now you can uniquely know which row in the required log table you have to go to in order to find this notification info.
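A minimal sketch of that arrangement, written with Python's sqlite3 instead of SQL Server so it can be run as-is. The three log table names, the detail column, and both helper functions are hypothetical.

```python
import sqlite3
import uuid

# Hypothetical log kinds and the table each one lives in.
LOG_TABLES = {"login": "login_log", "error": "error_log", "payment": "payment_log"}

conn = sqlite3.connect(":memory:")
for table in LOG_TABLES.values():
    conn.execute(f"CREATE TABLE {table} (id TEXT PRIMARY KEY, detail TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE notification (
        id       INTEGER PRIMARY KEY,
        log_id   TEXT NOT NULL,   -- GUID of the log row; deliberately no declared FK
        log_kind TEXT NOT NULL CHECK (log_kind IN ('login', 'error', 'payment'))
    )
""")

def add_log_with_notification(kind, detail):
    """Insert a log row plus its notification; return the notification id."""
    log_id = str(uuid.uuid4())
    conn.execute(
        f"INSERT INTO {LOG_TABLES[kind]} (id, detail) VALUES (?, ?)", (log_id, detail)
    )
    cur = conn.execute(
        "INSERT INTO notification (log_id, log_kind) VALUES (?, ?)", (log_id, kind)
    )
    return cur.lastrowid

def log_detail_for(notification_id):
    """Use log_kind to pick the table, then log_id to pick the row."""
    log_id, kind = conn.execute(
        "SELECT log_id, log_kind FROM notification WHERE id = ?", (notification_id,)
    ).fetchone()
    row = conn.execute(
        f"SELECT detail FROM {LOG_TABLES[kind]} WHERE id = ?", (log_id,)
    ).fetchone()
    return row[0] if row else None  # None means a dangling reference

note_id = add_log_with_notification("error", "disk full")
```

Note that nothing stops a log row from being deleted while its notification stays behind; that is the referential integrity being given up.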
The problem is that you have three separate log tables. Instead you should have had only one log table, with an extra column specifying what kind of logging it is. That way you'd have only one table: referential integrity would have been kept and the design would have been simple.
Use one table holding notification ids. Each of the three original tables holds a subtype of notification ids, with an FK on its own id to that table. Search for subtyping/subtables in databases. This is a standard design pattern/idiom.
(There are entities. We group them conceptually. We call the groups kinds or types. We say of a particular entity that it is a whatever kind or type of entity, or even that it "is a" whatever. We can have groups that contain all the entities of another group. Since the larger is a superset of the smaller we say that the larger type is a supertype of the smaller type, and the smaller is a subtype of the larger.)
There are idioms you can use to help constrain your tables declaratively. The main one is to have a subtype tag in the supertype table, and even also in the subtype tables (where each table has only one tag value).
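Here is one way the tag idiom might look, sketched with Python's sqlite3. Only two subtypes are shown to keep it short, and every table and column name (log_entry, login_log, error_log, kind, and so on) is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Supertype: one row per log entry of any kind, carrying the tag.
    CREATE TABLE log_entry (
        id   INTEGER PRIMARY KEY,
        kind TEXT NOT NULL CHECK (kind IN ('login', 'error')),
        UNIQUE (id, kind)              -- target of the composite FKs below
    );
    -- Subtypes: the tag is pinned to a single value, and (id, kind)
    -- must match a supertype row carrying the same tag.
    CREATE TABLE login_log (
        id       INTEGER PRIMARY KEY,
        kind     TEXT NOT NULL DEFAULT 'login' CHECK (kind = 'login'),
        username TEXT NOT NULL,
        FOREIGN KEY (id, kind) REFERENCES log_entry (id, kind)
    );
    CREATE TABLE error_log (
        id      INTEGER PRIMARY KEY,
        kind    TEXT NOT NULL DEFAULT 'error' CHECK (kind = 'error'),
        message TEXT NOT NULL,
        FOREIGN KEY (id, kind) REFERENCES log_entry (id, kind)
    );
    -- Notifications need a single, real FK to the supertype.
    CREATE TABLE notification (
        id           INTEGER PRIMARY KEY,
        log_entry_id INTEGER NOT NULL REFERENCES log_entry (id)
    );
""")

def try_insert(sql, params=()):
    """Run one INSERT; return False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

conn.execute("INSERT INTO log_entry (id, kind) VALUES (1, 'login')")
login_ok = try_insert("INSERT INTO login_log (id, username) VALUES (1, 'alice')")
# Entry 1 is tagged 'login', so it cannot also appear as an error row.
wrong_subtype_ok = try_insert("INSERT INTO error_log (id, message) VALUES (1, 'oops')")
notify_ok = try_insert("INSERT INTO notification (log_entry_id) VALUES (1)")
dangling_ok = try_insert("INSERT INTO notification (log_entry_id) VALUES (99)")
```

The composite key is what keeps one id from showing up in two subtype tables; the notification table itself never needs to know the kind.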
I eventually faced two main options:
1. Following the last suggestion in this answer.
2. Choosing a less normalized structure for the database, AKA fake/no FK's. To be precise, in my case it would be my second option above with fake FK's.
I chose option #2, as a DBA whom I consulted enlightened me on the idea that database normalization should be done according to possible structure breakage. In my case, although notifications are created based on logs, these FK's are not necessary for querying the notifications nor for querying the logs, and the app does not have to ensure this relationship in order to function properly. Thus, following option #1 may be "over-normalization".
Thanks all for your answers and comments.
Suppose that we have a "Cash Transactions" table which, as its name implies, keeps track of cash I/O. There might be a case in the future where we have cash transactions about completely different concepts. Since we model these "concepts" in the database, we would like to have some form of identifiability between transactions and the concepts. In other words, I would like to know which table and which entry a money transaction comes from.
I've come up with two solutions; first one involving a meta-data column identifying the table and a foreign key, second one with foreign keys as many as it needs and only using the non-null one so we know by the merit of the column name which table to look for it.
I reckon they both will work, but they feel hacky. It feels like there is an elegant solution, but it's not these two. Or perhaps I've hit the limit of relational DB design and should resort to NoSQL? How do I do this properly?
You should use a link table; the cash transactions should be unaware of what table to link to.
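One possible reading of that advice, sketched with Python's sqlite3: each concept gets its own link table, and the cash transaction table carries no reference to anything. The invoice concept and every name below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- The transaction table knows nothing about what caused it.
    CREATE TABLE cash_transaction (
        id           INTEGER PRIMARY KEY,
        amount_cents INTEGER NOT NULL
    );
    -- A hypothetical concept that can cause transactions.
    CREATE TABLE invoice (
        id       INTEGER PRIMARY KEY,
        customer TEXT NOT NULL
    );
    -- Link table: one per concept. A new concept means a new link
    -- table, with no change to cash_transaction.
    CREATE TABLE invoice_cash_transaction (
        cash_transaction_id INTEGER PRIMARY KEY REFERENCES cash_transaction (id),
        invoice_id          INTEGER NOT NULL REFERENCES invoice (id)
    );
""")

conn.execute("INSERT INTO invoice (id, customer) VALUES (1, 'ACME')")
tx_id = conn.execute(
    "INSERT INTO cash_transaction (amount_cents) VALUES (?)", (12500,)
).lastrowid
conn.execute(
    "INSERT INTO invoice_cash_transaction (cash_transaction_id, invoice_id) VALUES (?, ?)",
    (tx_id, 1),
)

def source_of(transaction_id):
    """Return ('invoice', invoice_id) if the transaction is linked, else None."""
    row = conn.execute(
        "SELECT invoice_id FROM invoice_cash_transaction WHERE cash_transaction_id = ?",
        (transaction_id,),
    ).fetchone()
    return ("invoice", row[0]) if row else None
```

With several concepts, source_of would have to check each link table in turn; both foreign keys in every link table stay real, declared constraints.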
I'm really new to database design, as I will now demonstrate:
I have an MS SQL database that I need to add a table to. The table contains information that pertains to another table. However, there are no candidates for primary keys (all fields can be duplicates). The only thing the table will ever be used for is to keep records that may be required for a certain kind of query, and they can be retrieved super-easily using a field that my other tables also contain (but never uniquely).
Specifically, my main table has a bunch of chemistry records. Each chemistry record is associated with another set of records called quality-control records (in my second table). They are associated by a field called "BatchID". The super-easy part is that I can say, "get all records with this BatchID" and get exactly what I need. But there can be multiple instances of any BatchID in both tables (in fact, there usually are), so I'd need to jump through hoops to link them. In a more general sense, in theory, is it OK to have a table floating around not attached to anything?
The overwhelmingly simple solution is to just put the quality control in the db with no relationships to the chemistry table. I'd need to insert at least one other table to relate it to anything else, maybe more, and the only reason for complicating my life like that is that I don't want to violate some important precept of database design.
My question is, is it ever OK to just have a free-floating table in a database? Or is that right out?
Thanks for any help.
In theory, it's ok to have a table that doesn't have any foreign key constraints. But the table you describe (both tables you describe) should probably have a foreign key that references the table of batches. We'd expect the table of batches to have "BatchID" as its primary key.
The relational model requires tables to have at least one candidate key. It's almost always a bad idea to have a SQL table that doesn't have a candidate key.
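A sketch of what that could look like, using Python's sqlite3 so it is runnable; the real database here is SQL Server. The batch table and its BatchID key come from the suggestion above, while the surrogate id columns and the other column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- One row per batch; BatchID is unique here even though it
    -- repeats in both detail tables.
    CREATE TABLE batch (
        batch_id TEXT PRIMARY KEY
    );
    -- Assumption: a surrogate id gives each detail table a candidate key.
    CREATE TABLE chemistry_record (
        id       INTEGER PRIMARY KEY,
        batch_id TEXT NOT NULL REFERENCES batch (batch_id),
        analyte  TEXT NOT NULL,
        result   REAL
    );
    CREATE TABLE qc_record (
        id         INTEGER PRIMARY KEY,
        batch_id   TEXT NOT NULL REFERENCES batch (batch_id),
        check_name TEXT NOT NULL,
        passed     INTEGER NOT NULL
    );
""")

conn.execute("INSERT INTO batch (batch_id) VALUES ('B-001')")
conn.executemany(
    "INSERT INTO chemistry_record (batch_id, analyte, result) VALUES (?, ?, ?)",
    [("B-001", "lead", 0.02), ("B-001", "arsenic", 0.01)],
)
conn.executemany(
    "INSERT INTO qc_record (batch_id, check_name, passed) VALUES (?, ?, ?)",
    [("B-001", "blank", 1), ("B-001", "spike recovery", 0)],
)

def qc_for_batch(batch_id):
    """'Get all records with this BatchID' stays a one-line query."""
    return conn.execute(
        "SELECT check_name, passed FROM qc_record WHERE batch_id = ? ORDER BY id",
        (batch_id,),
    ).fetchall()

def unknown_batch_rejected():
    """A QC row for a batch that does not exist should be refused."""
    try:
        conn.execute(
            "INSERT INTO qc_record (batch_id, check_name, passed) VALUES ('B-999', 'blank', 1)"
        )
        return False
    except sqlite3.IntegrityError:
        return True
```

The two detail tables still have no direct relationship to each other; they are related only through the batch they share.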
I want to make sure this is the best way to handle a certain scenario.
Let's say I have three main tables I will keep them generic. They all have primary keys and they all are independent tables referencing nothing.
Table 1
PK
VarChar Data
Table 2
PK
VarChar Data
Table 3
PK
VarChar Data
Here is the scenario: I want a user to be able to comment on specific rows in each of the above tables, but I don't want to create a bunch of comment tables. So as of right now I have handled it like so:
There is a comment table that has three foreign key columns, each one referencing one of the main tables above. There is a constraint that only one of these columns can have a value.
CommentTable
PK
FK to Table1
FK to Table2
FK to Table3
VarChar Comment
FK to Users
My question: is this the best way to handle the situation? Does a generic foreign key exist? Or should I have a separate comments table for each main table.. even though the data structure would be exactly the same? Or would a mapping table for each one be a better solution?
My question: is this the best way to handle the situation?
Multiple FKs with a CHECK that allows only one of them to be non-NULL is a reasonable approach, especially for relatively few tables like in this case.
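For reference, a minimal sketch of such a constraint using Python's sqlite3. The table and column names are placeholders, and the FK to Users is left out for brevity. Adding up the three IS NOT NULL tests works in SQLite because comparisons evaluate to 0 or 1; other DBMSs may need CASE expressions instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE comment (
        id        INTEGER PRIMARY KEY,
        table1_id INTEGER REFERENCES table1 (id),
        table2_id INTEGER REFERENCES table2 (id),
        table3_id INTEGER REFERENCES table3 (id),
        body      TEXT NOT NULL,
        -- Exactly one of the three FKs must be non-NULL.
        CHECK (
            (table1_id IS NOT NULL)
          + (table2_id IS NOT NULL)
          + (table3_id IS NOT NULL) = 1
        )
    );
    INSERT INTO table1 (id, data) VALUES (1, 'a');
    INSERT INTO table2 (id, data) VALUES (1, 'b');
""")

def try_insert(sql, params=()):
    """Run one INSERT; return False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

one_target_ok = try_insert(
    "INSERT INTO comment (table1_id, body) VALUES (1, 'fine')"
)
two_targets_ok = try_insert(
    "INSERT INTO comment (table1_id, table2_id, body) VALUES (1, 1, 'ambiguous')"
)
no_target_ok = try_insert("INSERT INTO comment (body) VALUES ('orphan')")
```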
The alternate approach would be to "inherit" the Table 1, 2 and 3 from a common "parent" table, then connect the comments to the parent.
Look here and here for more info.
Does a generic foreign key exist?
If you mean a FK that can "jump" from table to table, then no.
Assuming all 3 FKs are of the same type [1], you could theoretically implement something similar by keeping both foreign key value and referenced table name [2] and then enforcing it through a trigger, but declarative constraints should be preferred over that, even at a price of slightly more storage space.
If your DBMS fully supports "virtual" or "calculated" columns, then you could do something similar to above, but instead of having a trigger, generate 3 calculated columns based on FK value and table name. Only one of these calculated columns would be non-NULL at any given time and you could use "normal" FKs for them as you would for the physical columns.
But all that would only make sense when there are many "connectable" tables and your DBMS is not thrifty in storing NULLs. There is very little to gain when there are just 3 of them, or even when there are many more than that but your DBMS spends only one bit on each NULL field.
Or should I have a separate comments table for each main table, even though the data structure would be exactly the same?
The "data structure" is not the only thing that matters. If you happen to have different constraints (e.g. a FK that applies to one of them but not the other), that would warrant separate tables even though the columns are the same.
But, I'm guessing this is not the case here.
Or would a mapping table for each one be a better solution?
I'm not exactly sure what you mean by "mapping table", but you could do something like this:
Unfortunately, that would allow a single comment to be connected to more than one table (or no table at all), and is in itself a complication over what you already have.
All said and done, your original solution is probably fine.
1 Or you are willing to store it as string and live with conversions, which you should be reluctant to do.
2 In practice, this would not really be a name (as in string) - it would be an integer (or enum if DBMS supports it) with one of the well-known predefined values identifying the table.
Thanks for all the help, folks. I was able to formulate a solution with the help of a colleague of mine. Instead of multiple mapping tables, I decided to use just one.
This mapping table holds groups of comments, so it has no primary key. Each group row links back to a comment, so the same group id can appear in multiple rows. The relationship would be one-many-one.