There are two tables in my database: one is named Folder and the other is User. Each user has rights on these folders that determine which folders are visible to him and which are not. I normalized my tables to third normal form. My questions are: have I normalized the tables correctly, and can I normalize them any further? The attached image below shows the normalization I did.
Thank you!
Yes! You successfully achieved 3NF, since every non-key attribute (in your case folder right) depends on the whole key (user_id, folder_id) and there are no transitive dependencies.
Actually, your table is in 6NF too, since you cannot decompose the table further into its projections without losing information. :)
Since I am not aware of any normal form beyond 6NF, I'd say you cannot normalize it further.
Assuming you've replaced that table in the middle with the one at the bottom then, yes, you've achieved 3NF (a).
You can normalise it more (there's a 4th and 5th normal form) but it's pretty rare to have to go that far.
That doesn't mean more optimisation isn't possible. If the only two states you have are visible and non-visible, you can get rid of the states altogether and treat the existence of a row in the many-to-many table as indicating visible. That way, your final table would simply be:
user id   folder id
=======   =========
   1          1
   2          2
with the missing entries 1/2 and 2/1 indicating non-visible.
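A minimal sketch of that existence-means-visible idea, using SQLite through Python's sqlite3 module. The table name user_folder, the column names and the helper is_visible are made up for illustration; they are not taken from the image in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical many-to-many table: a row exists only when the folder is visible.
conn.execute("""
    CREATE TABLE user_folder (
        user_id   INTEGER NOT NULL,
        folder_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, folder_id)
    )
""")
conn.executemany("INSERT INTO user_folder VALUES (?, ?)", [(1, 1), (2, 2)])


def is_visible(conn, user_id, folder_id):
    """Return True when a row for this user/folder pair exists."""
    row = conn.execute(
        "SELECT 1 FROM user_folder WHERE user_id = ? AND folder_id = ?",
        (user_id, folder_id),
    ).fetchone()
    return row is not None


print(is_visible(conn, 1, 1))  # pair 1/1 is stored
print(is_visible(conn, 1, 2))  # pair 1/2 is missing
```

The trade-off is that this only works while there are exactly two states; a third right (say, read-only) would need the state column back.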
(a) A good way to remember 3NF is that every non-key column should depend on the key, the whole key and nothing but the key, so help me, Codd (a bit of DBA humour which explains why they don't get out much :-). The explanation is a little simplified, since true normalisation works on candidate keys, not just (for example) primary keys.
That means your middle table wasn't 3NF because its key would have been userid/folderid and folder name only depends on part of that key.
I have a table with only two attributes (DeliveryPerson and DeliveryTime). Each person can deliver a “product” at a specific Delivery Time. As you can see below John for example has delivered three products at different delivery times.
According to my task, I must put this table in 3NF, but I am confused because I cannot set “deliveryPerson” as a primary key because there are repeated values in this column. Is there any way of setting up this table to satisfy 3NF? If that is not possible, is it correct to have a table like this in a DB without a Primary key?
Thank you very much!
Normalisation is not about adding Primary Keys to a table where you've already decided the columns, it's about deciding what tables and columns you need in the first place. The inability to define a Primary Key on this table is the problem you've been asked to solve; the solution will involve creating new tables.
Rather than looking at the table, look at the data you're trying to model:
There are four (and probably any number of) delivery people
Each delivery person can have one or more (maybe even zero) delivery times
A normalised database will represent each of those separately. I'll leave the details for you to work out, rather than feeding you the full answer.
There are plenty of tutorials available which will probably explain it better than me.
I'm not a DB design expert and have what I suspect is a newbie question. If it's better answered in another forum (or from a simple reference), please let me know.
Given a Table "Recordings" and a table "Artists". Both tables have primary keys suitably defined. There is a relationship that we want to express between these tables. Namely, An artist could have many recordings, or no recordings. A recording can only have 1 or 0 artists. (We could have some obscure recording with no known artist).
I thought the solution to this problem was to have a foreign key pointing to artist in the Recording Table. This field could be null (the recording has no artist). Additionally, we should define cascading deletes, such that if an artist is deleted, all recordings that have a foreign key referring to that artist now have a foreign key of null. [I really do want to leave the actual recording when you delete the artist. Our real tables are not "artists" and "recordings" and a recording can exist without an artist].
However, this is not how my colleagues have set things up. There is no foreign key column in 'Recordings', but rather an extra table 'RecordingArtist_Mapping' with two columns,
RecordingKey ArtistKey
If an Artist (or Recording) is removed, the corresponding entry in this mapping table is removed. I'm not saying this is wrong, just different to what I expected. I have certainly seen a table like this when one has a many-many relationship, but not the relationship I described above.
So, my questions are:
Have you heard of this way of describing the relationship?
Is there a name for this type of table?
Is this a good way to model the relationship or would we be better off with the foreign key idea I explained? What are the pros/cons of each?
My colleagues pointed out that with the foreign key idea, you could have a lot of nulls in the Recordings Table, and that this violates (perhaps just in spirit?) one of the Five Normal Forms in Relational Database Theory. I'm way out of my league on this one :) Does it violate one of these forms? Which one? How? (bonus points for simple reference to "Five Normal Forms" :) ).
Thank you for your help and guidance.
Dave
On the face of it, this is simply an intersection table that allows a many-to-many relationship between two other tables.
When you find that you need one of these it is generally a good idea to consider "what does this table mean", and "have I included all the relevant attributes".
In this case the table tells you that the artist contributed to the recording in some way, and you might then consider "what was the nature of the contribution".
Possibly that they played a particular instrument, or instruments. Possibly they were a conductor.
You might then consider whether people other than artists made a contribution to the recording -- sound engineer? So that leads you to consider whether "artist" is a good table at all, because you might instead want a table that represents people in general, and then you can relate any of them to a recording. Maybe you even want to record the contribution of a non-person -- the London Symphony Orchestra, for example.
You can even have entities that contribute in multiple ways -- guitarist, vocalist, and producer? You might also consider whether there ought to be a ranking of the contributions so that they are listed in the correct order (which may be a contractual issue).
This is exactly the way that contributions to written works are generally modelled -- here is a list of the contributor codes used in the ONIX metadata schema for books, as an illustrative industry example: https://www.medra.org/stdoc/onix-codelist-17.htm
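As a rough sketch of such an intersection table carrying its own attributes, here is one possible shape in SQLite via Python's sqlite3 module. Every name here (contributor, contribution, role, billing_order) and all of the sample data are hypothetical, chosen only to illustrate the nature-of-contribution and ranking ideas.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "contributor" covers people and non-people (e.g. an orchestra).
    CREATE TABLE contributor (
        contributor_id INTEGER PRIMARY KEY,
        name           TEXT NOT NULL
    );
    CREATE TABLE recording (
        recording_id INTEGER PRIMARY KEY,
        title        TEXT NOT NULL
    );
    -- The intersection table records how each contributor took part
    -- and in which order the credits should be listed.
    CREATE TABLE contribution (
        recording_id   INTEGER NOT NULL REFERENCES recording(recording_id),
        contributor_id INTEGER NOT NULL REFERENCES contributor(contributor_id),
        role           TEXT    NOT NULL,
        billing_order  INTEGER NOT NULL,
        PRIMARY KEY (recording_id, contributor_id, role)
    );
    INSERT INTO contributor VALUES (1, 'Person A'), (2, 'Orchestra B');
    INSERT INTO recording VALUES (10, 'Example recording');
    INSERT INTO contribution VALUES
        (10, 2, 'performer', 1),
        (10, 1, 'conductor', 2),
        (10, 1, 'producer',  3);
""")

# List the credits for one recording in their agreed order.
credits = conn.execute(
    """
    SELECT c.name, x.role
    FROM contribution AS x
    JOIN contributor AS c ON c.contributor_id = x.contributor_id
    WHERE x.recording_id = ?
    ORDER BY x.billing_order
    """,
    (10,),
).fetchall()
print(credits)
```

Putting role into the key is what lets the same contributor appear more than once on one recording.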
Your solution with a foreign key in Recording is absolutely correct from the Normalization Theory point of view. It does not violate any significant normal form (the most important ones are Third Normal Form and Boyce-Codd Normal Form, and neither of them is violated).
Moreover, apart from being conceptually simpler and safer, it is more efficient from a practical point of view, since in general it reduces the number of joins that must be done. In my opinion, the pros are greater than the cons.
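For what it's worth, the nullable foreign key design can be sketched like this in SQLite through Python's sqlite3 module. The table and column names and the sample rows are assumptions for illustration. Note that SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on, and that "set the reference to null when the artist goes away" is spelled ON DELETE SET NULL rather than a cascading delete.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.executescript("""
    CREATE TABLE artist (
        artist_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE recording (
        recording_id INTEGER PRIMARY KEY,
        title        TEXT NOT NULL,
        -- Nullable: a recording may have no known artist.
        artist_id    INTEGER REFERENCES artist(artist_id) ON DELETE SET NULL
    );
    INSERT INTO artist VALUES (1, 'Some Artist');
    INSERT INTO recording VALUES (10, 'Known recording', 1);
    INSERT INTO recording VALUES (11, 'Obscure recording', NULL);
""")

# Deleting the artist keeps the recording and nulls out its reference.
conn.execute("DELETE FROM artist WHERE artist_id = 1")

rows = conn.execute(
    "SELECT recording_id, artist_id FROM recording ORDER BY recording_id"
).fetchall()
print(rows)
```

Both recordings survive the delete; only the reference to the removed artist is cleared.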
Yes, that's a viable setup. This is called vertical partitioning.
Basically, you move your artist field from recording to another table with the primary key referencing that on recording.
The benefit is that you don't necessarily have to retrieve artists when doing lookups on recordings. The drawback is that if you still have to, it would be somewhat slower because of the extra join.
Have you heard of this way of describing the relationship?
Yes, it's a many to many relationship. A recording can have more than one artist. An artist can have more than one recording.
Is there a name for this type of table?
I call them junction tables.
Is this a good way to model the relationship or would we be better off with the foreign key idea I explained? What are the pros/cons of each?
A junction table is required in a many to many relationship. When you have a one to many relationship, you would use a foreign key in the many table.
As for fourth and fifth normal form, the 1982 article A Simple Guide to Five Normal Forms in Relational Database Theory explains the different levels.
Under fourth normal form, a record type should not contain two or more independent multi-valued facts about an entity.
Fifth normal form deals with cases where information can be reconstructed from smaller pieces of information that can be maintained with less redundancy.
I remember the first 3 levels of normalization with this sentence.
I solemnly swear that the columns rely on the key, the whole key, and nothing but the key, so help me Codd.
Thank you for your knowledge in advance. I am studying for the Microsoft Technology Exam and one of the practice questions is :
Creating a primary key satisfies the first normal form. True or False?
I personally think it is False because the first normal form is to get rid of duplicate groups. But there is a sentence in the text (Database Fundamentals, Exam 98-364 by Microsoft Press) that says the following:
"The first normalized form (1NF) means the data is in an entity format, which basically means that the following three conditions must be met:
• The table must have no duplicate records. Once you have defined a primary key for the table, you have met the first normalized form criterion."
Please help me understand this, please explain like I am five. Thanks.
I can't explain this stuff to a five year old. I've tried. But I may be able to shed a little light on the subject. The first thing you need to know is that there have been multiple definitions of 1NF over the years, and these definitions sometimes conflict with each other. This may well be the source of your confusion, or at least some of it.
A useful thing to know is what purpose Ed Codd had in mind when he first defined it. Ed Codd defined First Normal Form, which he called Normal Form, back in the paper he published in 1970. His purpose in that paper was to demonstrate that a database built along relational lines would have all the expressive power that existing databases had. Existing databases often dealt with a parent that owns a set of children. For example, if the parent data item contains data about a student, each child might contain data about one course the student is taking.
You can actually define such a structure in terms of mathematical relations by allowing one of the attributes of a relation to be itself a relation. I'm going to call that "nesting" relations, although I don't recall what Ed Codd called it. In defining the relational data model, which is closely patterned after mathematical relations, Ed Codd wanted, for a variety of reasons, to forbid such a structure. His reasons were mostly practical, to make it more feasible to build the first relational database.
So he devoted some of his paper to proving that you could limit attributes to "simple" values without reducing the expressive power of the relational data model. I'm going to sidestep what "simple" means for the moment, although it's worth coming back to. He called this limitation "normal form". Once a second normal form was discovered, normal form got renamed to first normal form.
When it came time to build a relational database the engineers decided on a data structure called a "table". (I don't know the actual history, but this is approximate). A table is a logical structure made up of rows and columns. It can be thought of as an array of records, where each record represents a row, and all the records have the same header.
Now, if you want such a structure to represent a relation, you have to throw in a restriction that will prevent two rows with exactly the same values. If you had such duplicates, this would not represent a relation. A relation, by definition, has distinct elements. This is where primary keys come in. A table with a primary key can't have duplicate rows, because it can't have duplicate keys.
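Here is a small sketch of that point using SQLite via Python's sqlite3 module. The two tables (no_key and with_key) and their rows are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE no_key (name TEXT, age INTEGER)")
conn.execute(
    "CREATE TABLE with_key (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)

# Without a key, the table happily stores the same row twice,
# so it no longer represents a relation (a set of distinct tuples).
conn.execute("INSERT INTO no_key VALUES ('Ann', 30)")
conn.execute("INSERT INTO no_key VALUES ('Ann', 30)")
no_key_count = conn.execute("SELECT COUNT(*) FROM no_key").fetchone()[0]

# With a primary key, a second row with the same key is refused.
conn.execute("INSERT INTO with_key VALUES (1, 'Ann', 30)")
try:
    conn.execute("INSERT INTO with_key VALUES (1, 'Ann', 30)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(no_key_count, duplicate_rejected)
```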
But I'm not done yet. You didn't ask this, but it has come up a thousand times on Stack Overflow, so it's worth putting in here. A designer can defeat Ed Codd's original intent by creating a column that contains text that, in turn, contains comma separated values. In Codd's original formulation, a list of values is not "simple".
This is enormously appealing to the neophyte because it looks simpler and more efficient, to store a table with comma separated values than to create two tables one for parent records and the other for child records, and to join them when they are both needed for one query. Joins are not simple to the neophyte, and they do take some computer resources.
The CSV in a column design turns out to be an unfortunate design in nearly every case. The reason is that certain queries that could have been done real fast via an index now require a full table scan. This can turn seconds into minutes or minutes into hours. It's much more expensive than a join.
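To make the contrast concrete, here is a sketch comparing the two designs in SQLite via Python's sqlite3 module. The student/course tables, the index name and the plan() helper are all assumptions for illustration, and the exact wording of SQLite's EXPLAIN QUERY PLAN output differs between versions, so the code only looks for the words SCAN and SEARCH.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Design 1: comma separated values packed into one column.
    CREATE TABLE student_csv (
        student_id INTEGER PRIMARY KEY,
        courses    TEXT NOT NULL
    );
    INSERT INTO student_csv VALUES
        (1, 'math,physics'), (2, 'history'), (3, 'physics,art');

    -- Design 2: one row per student/course pair, with an index on course.
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL,
        course     TEXT    NOT NULL,
        PRIMARY KEY (student_id, course)
    );
    CREATE INDEX idx_enrollment_course ON enrollment (course);
    INSERT INTO enrollment VALUES
        (1, 'math'), (1, 'physics'), (2, 'history'),
        (3, 'physics'), (3, 'art');
""")

csv_sql = ("SELECT student_id FROM student_csv "
           "WHERE ',' || courses || ',' LIKE ? ORDER BY student_id")
normalized_sql = ("SELECT student_id FROM enrollment "
                  "WHERE course = ? ORDER BY student_id")


def plan(sql, params):
    """Join the 'detail' column of EXPLAIN QUERY PLAN into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[3] for row in rows)


csv_ids = [r[0] for r in conn.execute(csv_sql, ("%,physics,%",))]
normalized_ids = [r[0] for r in conn.execute(normalized_sql, ("physics",))]
csv_plan = plan(csv_sql, ("%,physics,%",))
normalized_plan = plan(normalized_sql, ("physics",))

print(csv_ids, normalized_ids)
print(csv_plan)         # expected to mention a SCAN of student_csv
print(normalized_plan)  # expected to mention a SEARCH using the index
```

Both queries return the same students, but only the second can be answered through the index; the pattern match has to look at every row.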
So you have to teach the newbies why keyed access to all data is a good thing, and this means you have to teach them what 1NF is really all about. And this can be as hard as teaching a five year old. Newbies are typically less ignorant than five year olds, but they tend to be more stubborn.
First Normal Form is mostly a matter of definition rather than design. In a relational system, the data structures are relation variables. Since a relation always consists of unique tuples a relation variable will always have at least one candidate key. By convention we call one key per relation a "primary" key so in a relational database the primary key requirement is always satisfied.
Similarly, in a relational database all attributes contain values which are identifiable by name, not by positional index and so the issue of "repeating groups" does not apply. The concept of a "repeating group" exists in some non-relational systems and that was what Codd was referring to when he originally defined 1NF.
However, problems of the interpretation of 1NF arise because most modern DBMSs are not truly relational even though people try to use them like relational systems. Since SQL DBMSs are not relational, how are we to interpret relational concepts like 1NF in a SQL DBMS?
The essence of 1NF is that each table must have a key and that tuples consist of single values for each attribute. Most SQL-based systems don't support the concept of "repeating groups" (multiple values in a single attribute position) so it is usually safe to say that if a SQL table has a key and does not permit nulls in any attribute position then it is "relational" and satisfies the spirit of 1NF.
A primary key must be completely unique. So once this is part of a record, it is distinct from any other record.
eg.
Record 1
---------
KEY = 1
Name = Fred Boggs
Age = 84
Record 2
--------
KEY = 2
Name = Fred Boggs
Age = 84
These 2 records are different because the field KEY is different.
Therefore although the rest of the data is the same, it meets the requirements for 1NF.
You are only quoting a fragment of the text Database Administration Fundamentals. A more complete quote is:
The first normalized form (1NF) means the data is in an entity format,
which basically means that the following three conditions must be met:
• The table must have no duplicate records. [...]
• The table also must not have multivalued attributes, meaning that
you can't combine in a single column multiple values that are
considered valid for a column. [...]
• The entries in the column or attribute must be of the same data
type.
(The history of the term "1NF" is full of confusions, vagueness and changes. But here's what this text says.)
Let me join the party ;)
For a question "is this relation in 1NF" to have a meaning, you first need a relation. And for your table to be a relation, you need a key. A table without any keys is not a relation.
Why? Because a relation is a set (of tuples/rows) and a set cannot contain the same element more than once (otherwise it would be a multiset), which is ensured by a key.
Once you have a relation by having a key, you can see if all your attributes are atomic, and if they are, you have yourself a 1NF.
So the answer to...
Creating a primary key satisfies the first normal form. True or False?
...is False. You do need a key, but you also need atomicity.
Currently I'm confused with the whole normalisation thing for databases.
Can anyone help me figure out how to go to 1NF following to 3NF? My 1NF version looks like this though not sure this is correct..:
http://imgur.com/i7JTcXw,qPMtPdq
The link contains both the UNF and my version of the 1NF table.
Having just looked here for the definitions :) : http://www.studytonight.com/dbms/database-normalization.php
1nf requires that each row can be reliably identified. In your table you do not have a clear primary key. Each row can be identified by flight number, part of the status fields (arrival or departure) and the scheduled time.
I can see that your table violates 2nf because your status fields seem to contain multiple pieces of information and are not of a single data type, i.e. each one tells you two pieces of information: arrival/departure and the time. There is also an implied value in the actual status of 'Cancelled', which would not have an associated time.
3nf eliminates dependencies between fields that are not part of the primary key, in your case I would point the finger at the from and to fields: their values could be part of a lookup table as each flight number is normally dedicated to a particular route and as such repeating them in this table is unnecessary duplication. For example you seem to be going to 'Sidney,' but really you are going to 'Sidney' (no comma) so a query for all flights going to Sidney is going to find QF431.
Another reason for removing them is that, as it stands, the QF431 departure and destination airports could change between rows, which could violate the rule that each flight number is unique to a flight path. With the current structure this rule could not be enforced by the DBMS.
Hi, I have been thinking for hours about a database normalization problem that I am trying to solve. In my problem I have a composite primary key, and one of the key's columns holds multiple values. Multiple values within one of the columns of the primary key is the major problem. I want to know whether first normal form removes only repeating groups outside the primary key, or whether a primary key column holding multiple values is also removed. This may still be too nebulous to follow, so I am posting a screenshot of the table:
http://tinypic.com/view.php?pic=ev47jr&s=5
(Kindly open the image above to see the table)
Here the question I wanna ask is that whether in first normal form only column number 4,5,6,7 will be removed or column number 2 will also be removed (Since it also contains multiple values)?
If I don't remove 2nd column then it won't come in 1NF, but if I remove it too, then it will go to 3NF directly. Help?
Thank you.
Here the question I wanna ask is that whether in first normal form
only column number 4,5,6,7 will be removed or column number 2 will
also be removed
All columns containing multiple values will be changed. That includes column 2.
If I don't remove 2nd column then it won't come in 1NF, but if I
remove it too, then it will go to 3NF directly.
Normalization doesn't work like this:
Determine the structure that is in 1NF, but is not yet in 2NF.
Determine the structure that is in 2NF, but is not yet in 3NF.
Determine the structure that is in 3NF, but is not yet in BCNF.
Determine the structure that is in BCNF, but is not yet in 4NF.
Determine the structure that is in 4NF, but is not yet in 5NF.
Determine the structure that is in 5NF, but is not yet in 6NF.
The relational model doesn't say that for every relation R that is in 1NF, there exists a decomposition that is in 2NF, but is not yet in 3NF. It just doesn't say that, but this is a common misunderstanding.
In practice, it's not unusual to remove a partial key dependency to get to 2NF, and find the results to be in 5NF.
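As a small illustration of removing a partial key dependency, here is a sketch in plain Python using sets of tuples. The relation (an order line with key order_id plus product_id, where product_name depends on product_id alone) and its data are made up; they are not the table from the screenshot. The check at the end confirms that joining the two projections gives back exactly the original rows, i.e. nothing was lost by splitting.

```python
# Hypothetical relation: (order_id, product_id, product_name, quantity).
# The key is (order_id, product_id), but product_name depends only on
# product_id, which is a partial key dependency.
original = {
    (1, "A", "Anvil", 2),
    (1, "B", "Bucket", 1),
    (2, "A", "Anvil", 5),
}

# Project the dependent attribute out into its own relation ...
products = {(pid, pname) for (_, pid, pname, _) in original}
# ... and keep only the attributes that depend on the whole key.
order_items = {(oid, pid, qty) for (oid, pid, _, qty) in original}

# Natural join of the two projections on product_id.
rejoined = {
    (oid, pid, pname, qty)
    for (oid, pid, qty) in order_items
    for (p, pname) in products
    if p == pid
}

print(rejoined == original)  # the decomposition loses no information
print(sorted(products))      # each product name is now stored once
```

Under these assumptions neither projection can be split further without losing information, so there is no separate 3NF, BCNF or 4NF step left to perform after the single decomposition.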