What exactly is a foreign key? - database

Ok. So I know what a primary key in DB is. If you have a table in a database, a primary key is a single value that is unique to each row in your table. For example:
id | name | whatever
-------------------------
1 Alice ....
2 Bob ....
45 Eve ....
988 .... ....
So I need a good, simple example to explain what exactly a foreign key is. Because I just don't get it :)
Edit: OK it's pretty easy, I guess I was over-complicating the problem.
So one final question: is the only restriction on foreign keys that they must be a valid primary key value in the table I am referring to?

A foreign key is a field that points to a primary key of another table.
Example:
Table Name - Users
UserID UserName UserRoleID
1 JohnD 1
2 CourtneyC 1
3 Benjamin 2
Table Name - UserRoles
UserRoleID Desc
1 Admin
2 Moderator
You can see that Users.UserRoleID is a foreign key which points to the primary key UserRoles.UserRoleID
The use of foreign keys makes setting up relationships on other tables simple, allowing you to link together the data of multiple tables in a nice way:
Example:
SELECT
a.UserID,
a.UserName,
b.Desc as [UserRole]
FROM
Users a INNER JOIN
UserRoles b ON a.UserRoleID = b.UserRoleID
Output would then be:
UserID UserName UserRole
1 JohnD Admin
2 CourtneyC Admin
3 Benjamin Moderator
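To try this out, here is a minimal sketch using Python's built-in sqlite3 module. The column types and the use of SQLite are my assumptions (the answer above does not name a database), and "Desc" is quoted because DESC is an SQL keyword. It declares the foreign key explicitly and runs the same join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE UserRoles (
        UserRoleID INTEGER PRIMARY KEY,
        "Desc"     TEXT NOT NULL          -- quoted because DESC is an SQL keyword
    );
    CREATE TABLE Users (
        UserID     INTEGER PRIMARY KEY,
        UserName   TEXT NOT NULL,
        UserRoleID INTEGER NOT NULL REFERENCES UserRoles (UserRoleID)
    );
    INSERT INTO UserRoles VALUES (1, 'Admin'), (2, 'Moderator');
    INSERT INTO Users VALUES (1, 'JohnD', 1), (2, 'CourtneyC', 1), (3, 'Benjamin', 2);
""")

# Same join as above: follow Users.UserRoleID to the matching UserRoles row.
rows = conn.execute("""
    SELECT a.UserID, a.UserName, b."Desc" AS UserRole
    FROM Users a
    INNER JOIN UserRoles b ON a.UserRoleID = b.UserRoleID
    ORDER BY a.UserID
""").fetchall()
print(rows)
```

The REFERENCES clause is what turns Users.UserRoleID from "a number that happens to match" into a foreign key the database checks.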

Let's say you have another field, which is the home city:
id | name | city
-------------------------
1 Alice San Francisco
2 Bob New York
45 Eve New York
988 Bill San Francisco
Now, it does not make sense to repeat the same cities in many rows. Doing so invites typos, wastes space, and makes results harder to retrieve, among other problems. So you use a foreign key:
id | name | fk_city
-------------------------
1 Alice 1
2 Bob 2
45 Eve 2
988 Bill 1
home city table:
id | name
-------------------------
1 | San Francisco
2 | New York
Hope it makes things clearer for you. :-)
Update: about your final question: Yes. :-)
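A small sketch of the payoff, assuming SQLite via Python's sqlite3 module and the table names person and city (both names are mine, the tables above are unnamed). Because each city name is stored once, a single UPDATE changes it for everyone who lives there:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE person (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        fk_city INTEGER NOT NULL REFERENCES city (id)
    );
    INSERT INTO city VALUES (1, 'San Francisco'), (2, 'New York');
    INSERT INTO person VALUES (1, 'Alice', 1), (2, 'Bob', 2), (45, 'Eve', 2), (988, 'Bill', 1);
""")

# The city name lives in one row, so one UPDATE fixes it for every resident.
conn.execute("UPDATE city SET name = 'New York City' WHERE id = 2")

residents = conn.execute("""
    SELECT p.name, c.name
    FROM person p
    JOIN city c ON c.id = p.fk_city
    ORDER BY p.id
""").fetchall()
print(residents)
```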

A foreign key is a column in one table that should uniquely identify something in another table. Thus, the values should correspond to primary keys in that other table.
For example, if you have a table of students taking courses, every record would include a student id and a course id. These are foreign keys into a student table (where there is one record for each student id), and a courses table (where there is one record for each course id).
Referential integrity means that all your foreign keys actually correspond to primary keys in these target tables. For example, all the student ids and course ids in your registration table correspond to real student ids and course ids.
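Here is a rough sketch of referential integrity being enforced, using Python's sqlite3 module. The table and column names (student, course, registration) are my own guesses at the schema described above, not something the answer specifies:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE registration (
        student_id INTEGER NOT NULL REFERENCES student (student_id),
        course_id  INTEGER NOT NULL REFERENCES course (course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student VALUES (1, 'Alice');
    INSERT INTO course  VALUES (10, 'Databases');
""")

conn.execute("INSERT INTO registration VALUES (1, 10)")  # both ids exist

try:
    conn.execute("INSERT INTO registration VALUES (1, 99)")  # there is no course 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

registered = conn.execute("SELECT COUNT(*) FROM registration").fetchone()[0]
print(rejected, registered)
```

The second insert is refused because course 99 does not exist, so the registration table can never point at a course that isn't there.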

id | name | whatever | countryid
-------------------------------------
1 Alice .... 13
2 Bob .... 42
45 Eve .... 1
988 .... .... 2
id | name
----------------
1 Japan
2 Spain
13 Norway
42 Italy
The foreign key points from the person table (first) to a row in the country table (second)

A foreign key is the primary key from another table stored on your table. Say you have a table of customers and a table of orders. The CustomerId is likely the primary key on the customer table, and the OrderId is likely the primary key on the order table. But on the order table you need to know the customer for this order, no? Therefore you need to store the CustomerId on the order table. In this case the CustomerId on the order table is a foreign key.
I would point out that there is no requirement that a primary key (and therefore a foreign key) be a single column. It's simpler, sure. But I've worked on enterprise systems where the primary key was 11 columns long, and I'm sure there are examples longer than that. That is, you needed to know the value for 11 different columns before you can uniquely identify the row.

In a relational database a one-to-many relationship is implemented by having the child table reference the ID of the parent table. The parent ID in the Child table is called a Foreign Key as it references a primary key of another table.

A foreign key is a field that references another table in the database. For example, suppose you had 2 tables, PERSON and ADDRESS. There is a field in PERSON called ID and a field in ADDRESS called PERSON_ID. You would make PERSON_ID refer to PERSON.ID as a foreign key. What this means is that you can't have an address that is not connected to a person, since the value in the ADDRESS.PERSON_ID field must exist in the table PERSON.

Using your table example, assume you have another table:
cartid | id | itemid
-----------------------
100 1 abc
101 1 cde
In this table, the primary key is cartid and the foreign key is id, which links to your first table. User 1 has two carts, each holding one item.
A foreign key is what you use to link two or more tables that hold related information.

Related

What is the purpose of a database table that contains only primary and foreign keys?

I'm trying to understand a simple music database design. There are some tables that contain only foreign keys and a primary key. I'm not sure how and when to use these tables or what to insert into them. The design looks like this:
Track:
id primary key
title
duration
live-performance (true or false)
year
Artist:
id primary key
name
ArtistTrack:
id primary key
artistID
trackID
Album:
id primary key
title
AlbumTrack:
id primary key
albumID
trackID
track-number
Genre:
id primary key
name
GenreTrack:
id primary key
genreID
trackID
For example, if I insert a track into the Track table and an artist into the Artist table, what should I then insert into the ArtistTrack table? I assume the attributes in the ArtistTrack tables are numbers identical to the primary keys in their respective tables?
I have seen several designs that are similar to this and I just don't get it. I know a foreign key links tables together. Could someone give me an example on how to use these tables?
The ArtistTrack table is a junction table, a classic way of representing an M:N relationship. If you put a reference to the trackId in the Artist table, it would mean that each artist can have (at most) one track. Assuming this is not a database to manage one hit wonders, that would be wrong. If you put a reference to the artistId in the Track table, each track could be composed by (at most) one artist. If you want to allow collaborations in this database, that would also be wrong.
The solution is to have an ArtistTrack table, which, as you noted, just has references to relevant artists and tracks. E.g.:
-- Insert the track:
INSERT INTO Track VALUES (1, 'some track', 10, false, 1999);
-- Insert a couple of artists:
INSERT INTO Artist VALUES (1, 'Jay');
INSERT INTO Artist VALUES (2, 'Silent Bob');
-- Make them collaborate on this track
INSERT INTO ArtistTrack VALUES (1, 1, 1);
INSERT INTO ArtistTrack VALUES (2, 2, 1);
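To see the junction table doing its job, here is a sketch in Python's sqlite3 module. The CREATE TABLE statements and column types are my assumptions (the question only lists column names, and I wrote live_performance with an underscore so it is a plain identifier). The inserts are the ones above, with false written as 0:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Track (
        id INTEGER PRIMARY KEY, title TEXT, duration INTEGER,
        live_performance INTEGER, year INTEGER
    );
    CREATE TABLE Artist (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ArtistTrack (
        id       INTEGER PRIMARY KEY,
        artistID INTEGER NOT NULL REFERENCES Artist (id),
        trackID  INTEGER NOT NULL REFERENCES Track (id)
    );
    INSERT INTO Track VALUES (1, 'some track', 10, 0, 1999);
    INSERT INTO Artist VALUES (1, 'Jay');
    INSERT INTO Artist VALUES (2, 'Silent Bob');
    INSERT INTO ArtistTrack VALUES (1, 1, 1);
    INSERT INTO ArtistTrack VALUES (2, 2, 1);
""")

# Who plays on track 1? Go through the junction table to reach Artist.
artists_on_track = conn.execute("""
    SELECT a.name
    FROM ArtistTrack link
    JOIN Artist a ON a.id = link.artistID
    WHERE link.trackID = ?
    ORDER BY a.id
""", (1,)).fetchall()
print(artists_on_track)
```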
There are a number of ways to make any database, and there aren't universal rules for this type of thing. Depending on the needs of the application, and the type of database software you are using, you might store data differently.
That being said, if you were looking to use this table design for a traditional relational database, you would probably do it as the following for the Artist-Track situation you mentioned:
If you are entering a new Track with a new Artist you would first enter the new Track with its associated data (title, duration, live-performance, year) then enter the Artist with its associated data (name). Then, to associate the Artist with the Track you would add a row in the ArtistTrack table that contains the primary_id (random and unique key) and the foreign keys of artistID and trackID. The foreign keys in ArtistTrack are the primary keys of the Track and Artist you just entered.
I am guessing the reasoning that your tables are structured as you described was for allowing the potential of a track having many Artists, and an Artist having many Tracks. Because of the Many-To-Many relationship between those two entities, there is a bridging or association table (ArtistTrack) that allows easy lookup on Tracks and Artists to find the associations between them.
The tables you are wondering about, like "GenreTrack" or "AlbumTrack", store how tracks and genres are combined and which track belongs on which album, respectively. This is a common way of storing so-called n:m relationships in a normalised database.
Let's look at GenreTrack as an example.
Say Table "Genre" contains the following:
id | name |
G1 | Rock |
G2 | Blues |
G3 | Pop |
and "Tracks" looks like this:
id | title | duration | live | year
T1 | "Pictures of You" | 7:68 | FALSE | 2006
T2 | "A Song for the Lovers" | 5:25 | FALSE | 1999
Now you want to be flexible with how you assign the genres to the tracks.
Maybe you want to have "Song for the Lovers" being a "Pop" as well as a "Rock" song. Genres are debatable to some degree after all.
So, for that, a simple foreign key in the "Tracks" table won't help here. You need to store this separately. And this is where the "GenreTrack" table comes into play.
It keeps all combinations of "Tracks" and "Genres".
Entries in it could look like this:
id | genreID |trackID
GT1 | G1 | T1
GT2 | G3 | T2
GT3 | G1 | T2
Now, you might be wondering why this table got its own "id" column. In fact, it is not necessary for making this a normalised table, since you could use "genreID" and "trackID" to form a compound primary key. However, some database frameworks apparently don't support compound keys and require a surrogate key for all tables, which is likely the reason for this "id" column here.
Selecting this data is straightforward:
SELECT t.title, t.year, g.name as genre_name
FROM
"Tracks" t
left outer join "GenreTrack" gt
on t.id = gt."trackID"
left outer join "Genre" g
on gt."genreID" = g."id";
Resulting in this :
Title | Year | Genre
"Pictures of You" | 2006 | Rock
"A Song for the Lovers" | 1999 | Rock
"A Song for the Lovers" | 1999 | Pop
Hope that gives you an idea on these m:n tables.
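As a sketch of the compound-key alternative mentioned above (dropping the surrogate "id" and using the two foreign keys as the primary key), here is a version in Python's sqlite3 module. The column types are my assumptions, and I left out the duration and live columns to keep it short:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Genre  (id TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE Tracks (id TEXT PRIMARY KEY, title TEXT NOT NULL, year INTEGER);
    CREATE TABLE GenreTrack (
        genreID TEXT NOT NULL REFERENCES Genre (id),
        trackID TEXT NOT NULL REFERENCES Tracks (id),
        PRIMARY KEY (genreID, trackID)     -- compound key, no surrogate id
    );
    INSERT INTO Genre VALUES ('G1', 'Rock'), ('G2', 'Blues'), ('G3', 'Pop');
    INSERT INTO Tracks VALUES ('T1', 'Pictures of You', 2006),
                              ('T2', 'A Song for the Lovers', 1999);
    INSERT INTO GenreTrack VALUES ('G1', 'T1'), ('G3', 'T2'), ('G1', 'T2');
""")

try:
    conn.execute("INSERT INTO GenreTrack VALUES ('G1', 'T2')")  # pair already present
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

pair_count = conn.execute("SELECT COUNT(*) FROM GenreTrack").fetchone()[0]
print(duplicate_rejected, pair_count)
```

A side benefit of the compound key is that the same genre cannot be assigned to the same track twice, which a lone surrogate id would not prevent.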
This is pretty straightforward.
I've attached some sample data; please refer to it and let me know if it helps.

Handling multi-select list in database design

I'm creating a clinic management system where I need to store Medical History for a patient. The user can select multiple history conditions for a single patient, however, each clinic has its own fixed set of Medical History fields.
For example:
Clinic 1:
DiseaseOne
DiseaseTwo
DiseaseThree
Clinic 2:
DiseaseFour
DiseaseFive
DiseaseSix
For my Patient visit in a specific Clinic , the user should be able to check 1 or more Diseases for the patient's medical history based on the clinic type.
I thought of two ways of storing the Medical History data:
First Option:
Add the fields to the corresponding clinic Patient Visit Record:
PatientClinic1VisitRecord:
PatientClinic1VisitRecordId
VisitDate
MedHist_DiseaseOne
MedHist_DiseaseTwo
MedHist_DiseaseThree
And fill up each MedHist field with the value "True/False" based on the user input.
Second Option:
Have a single MedicalHistory Table that holds all Clinics Medical History detail as well as another table to hold the Patient's medical history in its corresponding visit.
MedicalHistory
ClinicId
MedicalHistoryFieldId
MedicalHistoryFieldName
MedicalHistoryPatientClinicVisit
VisitId
MedicalHistoryFieldId
MedicalHistoryFieldValue
I'm not sure if these approaches are good practice. Is there a third approach that would be better?
If you are only interested in the diseases the person had, then storing the false / non-existent diseases is quite pointless. Without knowing all the details it is hard to give the best solution, but I would probably create something like this:
Person:
PersonID
Name
Address
Clinic:
ClinicID
Name
Address
Disease:
DiseaseID
Name
MedicalHistory:
HistoryID (identity, primary key)
PersonID
ClinicID
VisitDate (either date or datetime2 field depending what you need)
DiseaseID
Details, Notes etc
I designed the table this way on the assumption that a patient most likely has only one disease per visit; when there are several, more rows can be added instead of creating a separate table for the visit, which would make queries more complex.
If you also need to track cases where a disease was checked but the result was negative, add a status field to the history table.
If you need to limit which diseases can be entered by which clinic, you'll need a separate table for that too.
Create a set of relational tables to get a robust and flexible system, enabling the clinics to add an arbitrary number of diseases, patients, and visits. Also, constructing queries for various group-by criteria will become easier for you.
Build a set of 4 tables plus a Many-to-Many (M2M) "linking" table as given below. The first 3 tables will be less-frequently updated tables. On each visit of a patient to a clinic, add 1 row to the [Visits] table, containing the full detail of the visit EXCEPT disease information. Add 1 row to the M2M [MedicalHistory] table for EACH disease for which the patient will be consulting on that visit.
On a side note - consider using Table-Valued Parameters for passing a number of rows (1 row per disease being consulted) from your front-end program to the SQL Server stored procedure.
Table [Clinics]
ClinicId Primary Key
ClinicName
-more columns -
Table [Diseases]
DiseaseId Primary Key
ClinicId Foreign Key into the [Clinics] table
DiseaseName
- more columns -
Table [Patients]
PatientId Primary Key
ClinicId Foreign Key into the [Clinics] table
PatientName
-more columns -
Table [Visits]
VisitId Primary Key
VisitDate
DoctorId Foreign Key into another table called [Doctor]
BillingAmount
- more columns -
And finally the M2M table: [MedicalHistory]. (Important - All the FK fields should be combined together to form the PK of this table.)
ClinicId Foreign Key into the [Clinics] table
DiseaseId Foreign Key into the [Diseases] table
PatientId Foreign Key into the [Patients] table
VisitId Foreign Key into the [Visits] table
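A minimal sketch of this layout in SQLite (via Python's sqlite3 module), to show how the checked diseases for one visit end up as one row each. The column types, the sample rows, and leaving out the [Doctor] table and the "more columns" are all my simplifications, not part of the design above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Clinics  (ClinicId INTEGER PRIMARY KEY, ClinicName TEXT NOT NULL);
    CREATE TABLE Diseases (
        DiseaseId   INTEGER PRIMARY KEY,
        ClinicId    INTEGER NOT NULL REFERENCES Clinics (ClinicId),
        DiseaseName TEXT NOT NULL
    );
    CREATE TABLE Patients (
        PatientId   INTEGER PRIMARY KEY,
        ClinicId    INTEGER NOT NULL REFERENCES Clinics (ClinicId),
        PatientName TEXT NOT NULL
    );
    CREATE TABLE Visits (VisitId INTEGER PRIMARY KEY, VisitDate TEXT NOT NULL);
    CREATE TABLE MedicalHistory (
        ClinicId  INTEGER NOT NULL REFERENCES Clinics (ClinicId),
        DiseaseId INTEGER NOT NULL REFERENCES Diseases (DiseaseId),
        PatientId INTEGER NOT NULL REFERENCES Patients (PatientId),
        VisitId   INTEGER NOT NULL REFERENCES Visits (VisitId),
        PRIMARY KEY (ClinicId, DiseaseId, PatientId, VisitId)  -- all FKs form the PK
    );
    -- Made-up sample rows for illustration only.
    INSERT INTO Clinics  VALUES (1, 'Clinic 1');
    INSERT INTO Diseases VALUES (1, 1, 'DiseaseOne'), (2, 1, 'DiseaseTwo'), (3, 1, 'DiseaseThree');
    INSERT INTO Patients VALUES (1, 1, 'Sample Patient');
    INSERT INTO Visits   VALUES (1, '2020-01-15');
    -- The user ticked two of the three diseases for this visit: one row each.
    INSERT INTO MedicalHistory VALUES (1, 1, 1, 1), (1, 3, 1, 1);
""")

# Rebuild the multi-select list for visit 1.
checked = [name for (name,) in conn.execute("""
    SELECT d.DiseaseName
    FROM MedicalHistory mh
    JOIN Diseases d ON d.DiseaseId = mh.DiseaseId
    WHERE mh.VisitId = ?
    ORDER BY d.DiseaseId
""", (1,))]
print(checked)
```

Unchecked diseases simply have no row, so nothing needs to be stored for the "False" cases.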

foreign key in databases and creating a join table

I have a question regarding the associative or join table we create for the relationship between two entities.
I know that a foreign key can be NULL in the join table. But should the join table contain only the relationships? For example, in a bank there is a customer (key: id) entity and a loan (key: id) entity, and borrow is the relationship between them. Now suppose there are customers who haven't taken a loan.
Should I put those customers' ids in the borrow table with the corresponding loan-id foreign key set to NULL, or should I leave those customers out of the borrow table?
And what would be a good primary key for the join table? Is a primary key for the join table required at all?
You are right to have a join table between customer and loan.
But you do not need to put anything in this table until there is an actual borrow.
Your primary key for the borrow table should be a composite primary key made of customer_id and loan_id.
Customer
customer_id | name | ...
1 | Jon | ...
2 | Harry | ...
Loan
loan_id | amount | ...
1 | 1000 | ...
2 | 2000 | ...
Borrow
customer_id | loan_id
1 | 1
1 | 2
In this example you can see that Jon has two loans, and accordingly there are two records in the borrow table. Harry is a customer, but he has no loan, so there is no record in the borrow table for him.
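If you later need the customers without a loan, you don't need NULL rows in the join table for that; an outer join finds them. A small sketch using Python's sqlite3 module (the column types are my assumption, and the loan key is written as loan_id here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE Loan     (loan_id INTEGER PRIMARY KEY, amount INTEGER NOT NULL);
    CREATE TABLE Borrow (
        customer_id INTEGER NOT NULL REFERENCES Customer (customer_id),
        loan_id     INTEGER NOT NULL REFERENCES Loan (loan_id),
        PRIMARY KEY (customer_id, loan_id)   -- composite primary key
    );
    INSERT INTO Customer VALUES (1, 'Jon'), (2, 'Harry');
    INSERT INTO Loan     VALUES (1, 1000), (2, 2000);
    INSERT INTO Borrow   VALUES (1, 1), (1, 2);
""")

# Customers with no matching Borrow row come back with NULLs on the Borrow side.
without_loan = conn.execute("""
    SELECT c.name
    FROM Customer c
    LEFT JOIN Borrow b ON b.customer_id = c.customer_id
    WHERE b.customer_id IS NULL
""").fetchall()
print(without_loan)
```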
Every table (base or query result) has a parameterized statement (aka predicate):
customer [customer_id] has taken out loan [loan_id]
Borrows(customer_id,loan_id)
When you plug in a row like VALUES (customer_id,loan_id) (8,3) you get a statement (aka proposition):
customer 8 has taken loan 3
The rows that make true statements go in the table. The rows that make false statements stay out of the table. So every row that fits in a table makes a statement whether it is in it or not!
The table predicate corresponds to an application relationship where parameters correspond to columns. A row says something about those values and about identified application entities via them.
You pick the application relationships ie table predicates. Then you look at an application situation and put every true row into the tables. Or you look at the tables and see what things are true (per present rows) and false (per absent rows).
Queries also have predicates per their conditions and their logical and relational operators. And their results hold the rows that make them true.
So when someone hasn't taken a loan their customer_id doesn't appear in any row in Borrows. And when a loan has not been taken by anyone then its loan_id doesn't appear in any row of Borrows.
If a column can be null then its table's predicate often looks like:
[name] IS NULL AND [customer_id] identifies a customer
OR [name] IS NOT NULL
AND [customer_id] identifies a customer
AND customer [customer_id] is named [name]
Customer(customer_id NOT NULL,name NULL)
(Using NULL in other ways gets even more complicated. And we try to remove NULLs in queries as near to when they're introduced as possible.)
We determine candidate keys as usual and pick one as a primary key as usual. E.g. the key for Borrows is (customer_id,loan_id) because that set's values are unique and there is no smaller unique subset. But determining keys involves columns that are UNIQUE NOT NULL (which PRIMARY KEY is just a synonym for as a constraint). But we don't ever need to use NULL in a column because instead of a predicate/table like the above we can have two:
[customer_id] identifies a customer
Customer(customer_id NOT NULL)
customer [customer_id] is named [name]
Customer(customer_id NOT NULL,name NOT NULL)
Just like always a row goes in a table if and only if it makes a true statement.

Is it possible to have a foreign key which isn't covering the whole primary key of the referenced table?

I have two tables:
Table A: with a composite primary key.
CommonID (PK) | Month (PK) | some data...
-----------------------------------------
1 | May 2011 | ...
1 | June 2011 | ...
2 | May 2011 | ...
2 | June 2011 | ...
Table B: referencing to table A
ID (PK) | A_CommonID (FK)| some data...
-----------------------------------------
... | 1 | ...
... | 2 | ...
As you can see table B isn't referencing the whole primary key but it will definitely always reference a unique entry in table A because there is a global value for the specified used month which will be used for A.Month in SQL-queries.
Now my question is, is that allowed or am I violating several rules of Database design?
I would really appreciate a nice answer because I will use it in the final document which I have to write for my bachelor's degree.
Thanks a lot in advance!
No, this is not allowed.
If you have a composite primary key consisting of more than one column, your foreign keys must also be composite and reference all the columns involved in your primary key.
A foreign key must reference the primary key, the whole key and nothing but the key (so help you Codd) :-)
What you might be able to do is to have a separate unique index on that A_CommonID column in Table A so that your Table B can reference that unique index (instead of the PK).
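Here is a sketch of what happens in practice, using SQLite through Python's sqlite3 module. The table names B_partial and B_full and the column types are mine. Note that the behaviour differs between products: SQLite accepts the CREATE TABLE and only complains when you write to the child table, whereas I would expect most other databases to refuse the constraint when it is declared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE A (
        CommonID INTEGER NOT NULL,
        Month    TEXT    NOT NULL,
        PRIMARY KEY (CommonID, Month)
    );
    INSERT INTO A VALUES (1, 'May 2011');
    INSERT INTO A VALUES (1, 'June 2011');

    -- References only half of A's key; SQLite accepts the DDL but not later writes.
    CREATE TABLE B_partial (
        ID         INTEGER PRIMARY KEY,
        A_CommonID INTEGER NOT NULL REFERENCES A (CommonID)
    );
    -- References the whole composite key.
    CREATE TABLE B_full (
        ID         INTEGER PRIMARY KEY,
        A_CommonID INTEGER NOT NULL,
        A_Month    TEXT    NOT NULL,
        FOREIGN KEY (A_CommonID, A_Month) REFERENCES A (CommonID, Month)
    );
""")

try:
    conn.execute("INSERT INTO B_partial VALUES (1, 1)")
    partial_error = None
except sqlite3.Error as exc:          # SQLite reports a foreign key mismatch
    partial_error = str(exc)

conn.execute("INSERT INTO B_full VALUES (1, 1, 'May 2011')")
full_rows = conn.execute("SELECT COUNT(*) FROM B_full").fetchone()[0]
print(partial_error, full_rows)
```

The composite foreign key in B_full works because it names every column of A's primary key, which is the rule stated above.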

is there any way to make a two dimensional database on vb 2010?

I'm preparing a program for my teacher friends using vb 2010 express. They keep records about their students. I prepared a database that contains a table named "Mystudents". It has columns like "studentId, Name, Surname, etc.". My problem starts here. Each student attends lots of lessons during a year. For each student I must keep "which lessons they attended", "when they attended" and "which topic was covered in the lesson". For example:
Id: 104
Name : Jason
Surname : Black
Class : 10A
on 12.04.2011 he attended a math lesson and they did trigonometry
on 14.04.2011 he attended a physics lesson and they did gravity
.......
.......
Id: 105
Name : Marry
Surname : Steward
Class : 11B
on 02.04.2011 she attended a math lesson and they did trigonometry
on 14.04.2011 she attended a physics lesson and they did gravity
.......
........
I mean I have a list of data for each record of the database. Please help me.
In a relational database design, you would typically include a "relation table" to keep track of this:
--------------
| Student |
--------------
| 1
|
| 0..*
--------------------
| Students_Lessons |
--------------------
| 0..*
|
| 1
--------------
| Lesson |
--------------
The Student table has StudentID as primary key, the Lesson table has LessonID as primary key, and the Students_Lessons table contains the two columns StudentID and LessonID which will link students to lessons.
As you see in the database design above, each record in the Student table can be linked to zero or more records in the Students_Lessons table. The same goes for the Lesson table; each record can be linked to zero or more records in the Students_Lessons table. However, each record in the Students_Lessons table must be linked to exactly one record in Student, and one record in Lesson.
If each student may attend each lesson once only, you can extend the Students_Lessons table with additional columns for any other information that you require, otherwise it's probably better to extend the data model with additional tables for storing more information.
If I'm not wrong, you're looking for 1-N and M-N relationships.
My best suggestion is to learn more about database design. You can start by googling what 1-N and M-N relationships are in relational databases.
You're looking to support that in VB, but this is outside the scope of .NET; it's a database design thing :)
What is the question?
Try to write down all the properties/entities you'd like to store in your database. Based on that, you can perform some normalization to achieve the optimal database structure.
For example: a student's got an id, name and surname. Those properties belong together in a students table.
Further, a student will follow lessons, but this is not a 1:1 relationship. So first you'll get a 'lessons' table where all lessons are defined, and then a StudentsLessons table that links each lesson to the students who attended it.
I would use 3 tables.
students
student_id student name .. etc ..
1 jane doe
2 jack dee
lessons
lesson_id lesson_name .. etc..
1 gravity 101
2 hard maths
3 hamsters
student_lessons
student_id lesson_id
1 1
1 2
1 3
Google-ing database design stuff like "normal form", "1 to many", "many to many" and "many to 1" relationships would help you here.
-- One row per student.
CREATE TABLE student
(
    id INT NOT NULL PRIMARY KEY,
    firstName NVARCHAR(200),
    lastName NVARCHAR(200)
);
-- One row per subject (math, physics, ...).
CREATE TABLE subject
(
    id INT NOT NULL PRIMARY KEY,
    subjectName NVARCHAR(200)
);
-- One row per lesson held: which subject, on which date, covering which topic.
CREATE TABLE class
(
    id INT NOT NULL PRIMARY KEY,
    subjectId INT NOT NULL
        FOREIGN KEY REFERENCES subject (id),
    classDate DATE,
    topic NVARCHAR(200)
);
-- Junction table: which student attended which class.
CREATE TABLE student_class
(
    studentId INT NOT NULL
        FOREIGN KEY REFERENCES student (id),
    classId INT NOT NULL
        FOREIGN KEY REFERENCES class (id),
    PRIMARY KEY (studentId, classId)
);
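To show how the per-student list from the question comes back out of these tables, here is a sketch that ports the schema to SQLite and runs it from Python's sqlite3 module. The SQLite types are my substitution for the SQL Server ones, and the sample rows are taken from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT);
    CREATE TABLE subject (id INTEGER PRIMARY KEY, subjectName TEXT);
    CREATE TABLE class (
        id        INTEGER PRIMARY KEY,
        subjectId INTEGER NOT NULL REFERENCES subject (id),
        classDate TEXT,                 -- stored as an ISO date string
        topic     TEXT
    );
    CREATE TABLE student_class (
        studentId INTEGER NOT NULL REFERENCES student (id),
        classId   INTEGER NOT NULL REFERENCES class (id),
        PRIMARY KEY (studentId, classId)
    );
    INSERT INTO student VALUES (104, 'Jason', 'Black');
    INSERT INTO subject VALUES (1, 'math'), (2, 'physics');
    INSERT INTO class VALUES (1, 1, '2011-04-12', 'trigonometry'),
                             (2, 2, '2011-04-14', 'gravity');
    INSERT INTO student_class VALUES (104, 1), (104, 2);
""")

# Everything student 104 attended: date, subject, topic.
attended = conn.execute("""
    SELECT c.classDate, s.subjectName, c.topic
    FROM student_class sc
    JOIN class   c ON c.id = sc.classId
    JOIN subject s ON s.id = c.subjectId
    WHERE sc.studentId = ?
    ORDER BY c.classDate
""", (104,)).fetchall()
print(attended)
```

The "second dimension" is simply more rows in student_class, one per lesson attended, rather than extra columns on the student record.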

Resources