Example:
Master
ID  Student
--  -------
1   Cindy
2   Barbie
Detail
ID  ID_FK  Subject
--  -----  -------
1   1      Math
2   1      Science
3   1      English
4   2      English
5   2      History
Scenario: If I update Barbie to have three subjects (Math, Science, and English), should I delete all her records first and then add the new ones, or is there another way to do this? Thanks.
Turns out I have these three tables:
Recipes: id(primary key), name, time, difficulty, description, photo_url, amount_pax
Recipe_type: id(primary key), name
Ingredients: id(primary key), name
All three tables are required in my exercise.
I need to record the recipe type and the ingredients for each recipe, but I don't know how to do it. I could put id_type_recipe and id_ingredients in the recipe table as foreign keys, but then the same id would be repeated across several recipes: if tomato has id 1, another recipe may also contain tomato. That does not seem feasible to me, because those ids are primary keys in another table and cannot be repeated. How can I put the recipe type and the ingredients in the recipe table?
I attach my E-R diagram.
I started by entering the ingredient id and the recipe type id but I realized that they were going to be repeated and this is not possible as they are primary keys in other tables.
To handle Recipe_Type you can add Recipe_TypeId as an additional column to the Recipes table.
But for Ingredients, expecting each Recipe may need several Ingredients (of different quantities) you will need an additional table.
Let's say you have recipes for Tomato Soup and Chicken Noodle Soup, with Recipes records like this:
ID  Name                 Time   Difficulty  Description      Photo_url   amount_pax  Recipe_Type_ID
--  -------------------  -----  ----------  ---------------  ----------  ----------  --------------
1   Tomato Soup          3:00   1           Take tomatos...  http://...  2           1
2   Chicken Noodle Soup  20:00  2           Cook Chic...     http://...  4           1
and a Recipe_type like this:
ID  Name
--  ----
1   Soup
And Ingredients like this:
ID  Name
--  --------
1   Tomatoes
2   Milk
3   Salt
4   Chicken
5   Water
6   Noodles
Now you also need an intersection table (named, say RecipeIngredient) that would look something like this:
RecipeID  IngredientID  Qty  Measure
--------  ------------  ---  -------
1         1             8    Each
1         2             1.5  Cups
1         3             1    Tsp
1         5             1.5  Cups
2         3             1    Tsp
2         4             2    Lbs
2         5             4    Cups
2         6             1    Lbs
This new table will have a compound key like this: (RecipeID, IngredientID)
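As a rough illustration of the layout above, here is a minimal sketch using Python's built-in sqlite3 module. The snake_case column names are my own choice, and the Recipes table is cut down to three columns to keep the example short; treat it as one possible arrangement, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign key checks off by default
conn.executescript("""
CREATE TABLE Recipe_type (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Simplified: time, difficulty, description, etc. are omitted here.
CREATE TABLE Recipes (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    recipe_type_id INTEGER NOT NULL REFERENCES Recipe_type(id)
);
CREATE TABLE Ingredients (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE RecipeIngredient (
    recipe_id INTEGER NOT NULL REFERENCES Recipes(id),
    ingredient_id INTEGER NOT NULL REFERENCES Ingredients(id),
    qty REAL NOT NULL,
    measure TEXT NOT NULL,
    PRIMARY KEY (recipe_id, ingredient_id)  -- the compound key
);
INSERT INTO Recipe_type VALUES (1, 'Soup');
INSERT INTO Recipes VALUES (1, 'Tomato Soup', 1), (2, 'Chicken Noodle Soup', 1);
INSERT INTO Ingredients VALUES
    (1, 'Tomatoes'), (2, 'Milk'), (3, 'Salt'), (4, 'Chicken'), (5, 'Water'), (6, 'Noodles');
INSERT INTO RecipeIngredient VALUES
    (1, 1, 8, 'Each'), (1, 2, 1.5, 'Cups'), (1, 3, 1, 'Tsp'), (1, 5, 1.5, 'Cups'),
    (2, 3, 1, 'Tsp'), (2, 4, 2, 'Lbs'), (2, 5, 4, 'Cups'), (2, 6, 1, 'Lbs');
""")

# Salt (id 3) is used by both recipes; only the (recipe, ingredient) pair must be unique.
salt_recipes = [row[0] for row in conn.execute(
    """SELECT r.name
       FROM Recipes r
       JOIN RecipeIngredient ri ON ri.recipe_id = r.id
       WHERE ri.ingredient_id = 3
       ORDER BY r.name""")]

# Listing the same ingredient twice for one recipe is rejected by the compound key.
try:
    conn.execute("INSERT INTO RecipeIngredient VALUES (1, 3, 2, 'Tsp')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The ingredient id is a primary key only in Ingredients; in RecipeIngredient it is an ordinary foreign key value, so it can appear in as many rows as there are recipes using that ingredient.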
So a friend of mine is on a new project. The customer wants to use SQL Server roles (admin, poweruser, superior, general, guest) and wants to restrict which columns of a table can be returned depending on the user's role.
Imagine a table below
ID Make Model Type ProdCost ROI Frequency RecallDate
-------------------------------------------------------------------
1 70 This 1 $12 2 2
2 71 That 2 $12 3 2
3 72 Sparrow 3 $12 2 3
4 72 Duck 4 $12 2 N/a
5 76 Fellon 5 $12 4
Admin role can retrieve all columns
Poweruser can retrieve all columns except ID
Guest can retrieve the ID, Make, and Model columns
General gets ID, ProdCost, ROI, Frequency, RecallDate
How do we make this work with Entity Framework? Ideas? The client wants to use views but it all seems messy
I'm in an intro to database management course and we're learning about normalizing data (1NF, 2NF, 3NF, etc.), and I'm super confused about how to actually go about doing it. I've read up on this and consulted various sites and YouTube videos, and I still can't seem to get it to click. I am using Microsoft Access 2013, if that's of any help.
This is the data I'm working with.
Thanks.
Edit1: Alright, I think I have the tables set up correctly. But now I'm having trouble actually inputting data to go from one table to the next. Here's my relationship table.
On a very basic level, any repeating values in a table are candidates for normalization. Duplicated data is usually a bad idea. Say you needed to update a patient's surname - you now have to update all the occurrences in this table, and possibly many others throughout the rest of the database. Much better to store each patient's details in one place only.
This is where normalization comes in. Looking down the columns, you can see that there are repeating values for data about dentists, patients and surgeries, so we should normalize towards having tables for each of these entities, as well as the original table that contains appointments, giving you four tables in total.
Extract the entities out into their own tables, and give each row a primary (unique) key - just use an incrementing integer for now. (Edit: as suggested in the comment we could use the natural keys of PatientNo, StaffNo and SurgeryNo instead of creating surrogates.)
Then, instead of each patient's name and number appearing multiple times in the appointments table, we just reference the key of the master record in the Patient table. This is called a foreign key.
Then, do the same for Dentist and Surgery.
You will end up with tables looking something like this:
APPOINTMENT
AppointmentID  DentistID  PatientID  AppointmentTime  SurgeryID
----------------------------------------------------------------
1              1          1          12 Aug 03 10:00  1
2              1          2          ...              2
3              2          3          ...              1
4              2          3          ...              1
5              3          2          ...              2
6              3          4          ...              3
DENTIST
DentistID  Name           StaffNo
--------------------------------------
1          Tony Smith     S1011
2          Helen Pearson  S1024
3          Robin Plevin   S1032
PATIENT
PatientID  Name           PatientNo
---------------------------------------
1          Gillian White  P100
2          Jill Bell      P105
3          Ian MackKay    P108
4          John Walker    P110
SURGERY
SurgeryID  SurgeryNo
-------------------------
1          S10
2          S15
3          S13
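To make the four-table layout concrete, here is a small sketch using Python's built-in sqlite3 module (an assumption on my part, since the question uses Access; the same DDL idea carries over). The column types and UNIQUE constraints are my own guesses. It loads the rows shown above, joins them back into the flat shape of the original data, and shows that an appointment pointing at a non-existent patient is refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign key checks off by default
conn.executescript("""
CREATE TABLE Dentist (DentistID INTEGER PRIMARY KEY, Name TEXT, StaffNo TEXT UNIQUE);
CREATE TABLE Patient (PatientID INTEGER PRIMARY KEY, Name TEXT, PatientNo TEXT UNIQUE);
CREATE TABLE Surgery (SurgeryID INTEGER PRIMARY KEY, SurgeryNo TEXT UNIQUE);
CREATE TABLE Appointment (
    AppointmentID INTEGER PRIMARY KEY,
    DentistID INTEGER NOT NULL REFERENCES Dentist(DentistID),
    PatientID INTEGER NOT NULL REFERENCES Patient(PatientID),
    AppointmentTime TEXT,
    SurgeryID INTEGER NOT NULL REFERENCES Surgery(SurgeryID)
);
INSERT INTO Dentist VALUES
    (1, 'Tony Smith', 'S1011'), (2, 'Helen Pearson', 'S1024'), (3, 'Robin Plevin', 'S1032');
INSERT INTO Patient VALUES
    (1, 'Gillian White', 'P100'), (2, 'Jill Bell', 'P105'),
    (3, 'Ian MackKay', 'P108'), (4, 'John Walker', 'P110');
INSERT INTO Surgery VALUES (1, 'S10'), (2, 'S15'), (3, 'S13');
-- Times for appointments 2-6 are elided in the tables above, so they stay NULL here.
INSERT INTO Appointment VALUES
    (1, 1, 1, '12 Aug 03 10:00', 1), (2, 1, 2, NULL, 2), (3, 2, 3, NULL, 1),
    (4, 2, 3, NULL, 1), (5, 3, 2, NULL, 2), (6, 3, 4, NULL, 3);
""")

# Join the keys back to names to rebuild the flat, pre-normalization view of the data.
flat = conn.execute("""
    SELECT a.AppointmentID, d.Name, p.Name, s.SurgeryNo
    FROM Appointment a
    JOIN Dentist d ON d.DentistID = a.DentistID
    JOIN Patient p ON p.PatientID = a.PatientID
    JOIN Surgery s ON s.SurgeryID = a.SurgeryID
    ORDER BY a.AppointmentID
""").fetchall()

# An appointment for a patient that does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO Appointment VALUES (7, 1, 99, NULL, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Each patient's name now lives in exactly one row of Patient, so correcting it there is reflected in every appointment the join returns.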
The first step in data modelling and normalization is to understand your data. Study it and understand the domain "objects" or tables that exist within your model. That will give you an idea of how to start. Sometimes a single table or query sample is not enough to fully understand the database, but in your case, we can use the sample data and make some assumptions.
Secondly, look for repeated / redundant data. If you see copies of names, there is a good chance that is a candidate for a foreign key. Our assumption tells us that STAFF_NO is a primary key candidate for DENTIST because each unique STAFF_NO correlates to a unique DENTIST_NAME, so I see a good candidate DENTIST table (STAFF_NO, DENTIST_NAME).
Example in some table of SURGERY:
ID  STAFF_NO  DENTIST_NAME
1   1         Fred Sanford
2   1         Fred Sanford
3   3         Lamont Sanford
4   3         Lamont Sanford
Why store these over and over? What happens when Fred says "But my correct name is Fred G Sanford" and you have to update your database? In the current table, you have to update the name in many rows. If you had normalized it, you'd have a single location for the name, in the DENTIST table.
So I can take the unique dentists and store them in DENTIST
create table DENTIST(staff_no integer primary key, dentist_name varchar(100));
-- One possible way to populate our dentist table is to use a distinct query from surgery
insert into DENTIST
select distinct staff_no, dentist_name from surgery;
STAFF_NO  DENTIST_NAME
1         Fred Sanford
3         Lamont Sanford
SURGERY table now points to DENTIST table
ID  STAFF_NO
1   1
2   1
3   3
4   3
And you can now create a view, VIEW_SURGERY, to join the DENTIST_NAME back in to satisfy the needs of typical queries.
create view view_surgery as
select s.id, d.staff_no, d.dentist_name
from surgery s join dentist d
on s.staff_no = d.staff_no; -- join here
So now an update to DENTIST by its primary key touches a single row:
update dentist set dentist_name = 'Fred G Sanford' where staff_no = 1;
And querying the view will show the updated name for N rows:
select * from view_surgery;
ID  STAFF_NO  DENTIST_NAME
1   1         Fred G Sanford
2   1         Fred G Sanford
3   3         Lamont Sanford
4   3         Lamont Sanford
In short, you are removing redundancy.
This is just a sample, and one way to do it. Manual normalization like this is not as common when you have modelling tools, but the point is, we can look at data, spot redundancies and factor those redundancies into new tables, and relate those new tables by foreign keys and joins, then build views to represent the original data.
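The whole sequence can be tried end to end with Python's built-in sqlite3 module. This is only a sketch under a few assumptions: the original unnormalized table is called surgery_flat here (a made-up name, so that the slimmed-down surgery table can be created alongside it), and the column types are guesses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- surgery_flat is a hypothetical name for the original, redundant table.
CREATE TABLE surgery_flat (id INTEGER PRIMARY KEY, staff_no INTEGER, dentist_name TEXT);
INSERT INTO surgery_flat VALUES
    (1, 1, 'Fred Sanford'), (2, 1, 'Fred Sanford'),
    (3, 3, 'Lamont Sanford'), (4, 3, 'Lamont Sanford');

-- Factor the repeated names out into DENTIST.
CREATE TABLE dentist (staff_no INTEGER PRIMARY KEY, dentist_name TEXT);
INSERT INTO dentist SELECT DISTINCT staff_no, dentist_name FROM surgery_flat;

-- SURGERY keeps only the key that points at DENTIST.
CREATE TABLE surgery (id INTEGER PRIMARY KEY, staff_no INTEGER REFERENCES dentist(staff_no));
INSERT INTO surgery SELECT id, staff_no FROM surgery_flat;

-- The view rebuilds the original shape for typical queries.
CREATE VIEW view_surgery AS
SELECT s.id, d.staff_no, d.dentist_name
FROM surgery s JOIN dentist d ON s.staff_no = d.staff_no;
""")

# One row is updated in dentist...
cursor = conn.execute(
    "UPDATE dentist SET dentist_name = 'Fred G Sanford' WHERE staff_no = 1")
rows_updated = cursor.rowcount

# ...and every matching row in the view reflects the change.
view_rows = conn.execute("SELECT * FROM view_surgery ORDER BY id").fetchall()
```

Here rows_updated counts the single DENTIST row that changed, while view_rows holds the four rows the view returns after the update.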
Sometimes I have a hard time seeing the difference between an entity and a column when I start making a diagram. I don't know when something is supposed to be an entity or a column. For example, in some game a user can play alone or in a group. Would you make two different entities, User and GroupUser?
Also, suppose the User has levels, status, and badges they earn as part of the game. Would these also be entities, or would they just be columns of the User entity?
An entity could be a person (e.g. Student), a place (e.g. Room Name), an object (e.g. Books), or an abstract concept (e.g. Course, Order) that is represented in your database and would normally become a table.
Columns, on the other hand, are the attributes of your entity.
So, in your case you have a User entity, and the possible columns (attributes, or fields) are:
UserID, UserLevel, UserStatus, Badges, PlayStatus (values could be individual or group).
Badges, although a column here, could turn into an entity of its own if it violates the normalization rules.
For example if you have this Table for User:
Table: Users
UserID  UserName     UserStatus  PlayStatus  Badges
------  --------     ----------  ----------  ------
1       Surefire     Active      Single      Private, Warrior, Platoon Leader
2       FastMachine  Active      Group       Private, Warrior
3       BeatTheGeek  Inactive    Group       Private
The Badges column here violates 1NF (first normal form), which says that there should be no repeating groups, or in this case no multi-valued columns. So this could be normalized like:
Table: Users
UserID  UserName     UserStatus  PlayStatus
------  --------     ----------  ----------
1       Surefire     Active      Single
2       FastMachine  Active      Group
3       BeatTheGeek  Inactive    Group
Table: Badges
BadgeID  BadgeName
-------  ---------
1        Private
2        Indie
3        Warrior
4        Platoon Leader
5        Colonel
6        1 Star General
7        2 Star General
8        3 Star General
9        4 Star General
10       5 Star General
11       Hero
Table: UserBadgesHistory
UserID  BadgeID  ReceiveDate
------  -------  -----------
1       1        12/01/2013
1       3        12/05/2013
1       4        1/5/2014
2       1        2/5/2014
2       3        2/10/2014
3       2        11/10/2013
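One payoff of splitting Badges out is that badge questions become plain joins instead of string parsing. Below is a minimal sketch with Python's built-in sqlite3 module; it loads only users 1 and 2 and the badges they hold, stores the dates as text exactly as written above, and the helper name badges_of is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (
    UserID INTEGER PRIMARY KEY, UserName TEXT, UserStatus TEXT, PlayStatus TEXT
);
CREATE TABLE Badges (BadgeID INTEGER PRIMARY KEY, BadgeName TEXT);
CREATE TABLE UserBadgesHistory (
    UserID INTEGER REFERENCES Users(UserID),
    BadgeID INTEGER REFERENCES Badges(BadgeID),
    ReceiveDate TEXT,
    PRIMARY KEY (UserID, BadgeID)
);
-- Subset of the sample rows: users 1 and 2 and the badges they hold.
INSERT INTO Users VALUES
    (1, 'Surefire', 'Active', 'Single'), (2, 'FastMachine', 'Active', 'Group');
INSERT INTO Badges VALUES (1, 'Private'), (3, 'Warrior'), (4, 'Platoon Leader');
INSERT INTO UserBadgesHistory VALUES
    (1, 1, '12/01/2013'), (1, 3, '12/05/2013'), (1, 4, '1/5/2014'),
    (2, 1, '2/5/2014'), (2, 3, '2/10/2014');
""")


def badges_of(user_id):
    """Return the badge names held by one user, ordered by BadgeID."""
    rows = conn.execute(
        """SELECT b.BadgeName
           FROM UserBadgesHistory h
           JOIN Badges b ON b.BadgeID = h.BadgeID
           WHERE h.UserID = ?
           ORDER BY b.BadgeID""",
        (user_id,))
    return [row[0] for row in rows]


# The reverse question (who holds a given badge?) is the same join from the other side.
warriors = [row[0] for row in conn.execute(
    """SELECT u.UserName
       FROM Users u
       JOIN UserBadgesHistory h ON h.UserID = u.UserID
       JOIN Badges b ON b.BadgeID = h.BadgeID
       WHERE b.BadgeName = 'Warrior'
       ORDER BY u.UserName""")]
```

With the multi-valued Badges column, the second query would have needed a LIKE match against a comma-separated string.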
In general, an entity has multiple columns (i.e. attributes) of its own, and a column (or attribute) does not.
In your example, if the only data you're interested in storing is a User's current level, then level is unlikely to be an entity. This is because it would have only a single attribute of name/number. If you wanted to find all Users currently at level 4, you would simply do a query with level = 4.
On the other hand, if you had a reason to add additional data about the level, such as what abilities are associated with that level or the date a given User achieved the level, then you would want to make Level a separate entity.
A Level entity would have an ID, a number or name, and whatever other attributes you need as data.
 ID | Prerequisite | Ability
----+--------------+--------------
  1 | NULL         | May gain foos
  2 | Gain 10 foos | May gain bars
  3 | Gain 20 bars | 30 free foos
In a fully normalized state, you would have another entity called UserLevel in which you would store data about, for example, when a certain User gained a level.
The UserLevel entity would contain the LevelID and the UserID as foreign keys (links back to the other entities), and a DateAchieved column for when the User achieved the level.
 LevelID | UserID | DateAchieved
---------+--------+-------------
       1 |      1 | 2014-02-01
       1 |      2 | 2014-02-01
       2 |      1 | 2014-02-05
       3 |      1 | 2014-02-09
       2 |      2 | 2014-02-11
       4 |      1 | 2014-02-13
This shows User 1 and User 2 starting at Level 1 on the same day and leveling up at different rates.
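As a rough sketch of what the UserLevel table buys you, the queries below (run through Python's built-in sqlite3 module) derive each user's current level and how long each took to get from level 1 to level 2. Two assumptions of mine: a higher LevelID means a later level, and only the UserLevel table is created, without the foreign keys to User and Level, to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserLevel (
    LevelID INTEGER NOT NULL,
    UserID INTEGER NOT NULL,
    DateAchieved TEXT NOT NULL,
    PRIMARY KEY (LevelID, UserID)
);
INSERT INTO UserLevel VALUES
    (1, 1, '2014-02-01'), (1, 2, '2014-02-01'), (2, 1, '2014-02-05'),
    (3, 1, '2014-02-09'), (2, 2, '2014-02-11'), (4, 1, '2014-02-13');
""")

# Current level per user, assuming a higher LevelID is a later level.
current_level = dict(conn.execute(
    "SELECT UserID, MAX(LevelID) FROM UserLevel GROUP BY UserID"))

# Days each user needed to go from level 1 to level 2.
days_to_level_2 = dict(conn.execute("""
    SELECT a.UserID,
           CAST(julianday(b.DateAchieved) - julianday(a.DateAchieved) AS INTEGER)
    FROM UserLevel a
    JOIN UserLevel b ON b.UserID = a.UserID
    WHERE a.LevelID = 1 AND b.LevelID = 2
"""))
```

Neither question can be answered if the level is only a single column on the User row, because that column holds the current value and no history.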
This question asks how to select a user's rank by his id.
id  name   points
1   john   4635
3   tom    7364
4   bob    234
6   harry  9857
The accepted answer is
SELECT  uo.*,
        (
        SELECT  COUNT(*)
        FROM    users ui
        WHERE   (ui.points, ui.id) >= (uo.points, uo.id)
        ) AS rank
FROM    users uo
WHERE   id = #id
which makes sense. I'd like to understand what the performance tradeoffs would be, between this approach, or by modifying the db structure to store a calculated rank (I guess that would require massive changes every time there's a rank change), or any other approaches that I'm too newb to think of. I'm a db noob.
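For anyone who wants to experiment with that query, here is a small sketch using Python's built-in sqlite3 module and the four sample rows. To stay portable, the row comparison (ui.points, ui.id) >= (uo.points, uo.id) is written out in its expanded form, the #id placeholder becomes a bound parameter, and the alias is renamed to user_rank; the function name rank_of is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, points INTEGER);
INSERT INTO users VALUES
    (1, 'john', 4635), (3, 'tom', 7364), (4, 'bob', 234), (6, 'harry', 9857);
""")

# (ui.points, ui.id) >= (uo.points, uo.id) expands to:
#   ui.points > uo.points OR (ui.points = uo.points AND ui.id >= uo.id)
RANK_SQL = """
SELECT uo.id, uo.name, uo.points,
       (SELECT COUNT(*)
        FROM users ui
        WHERE ui.points > uo.points
           OR (ui.points = uo.points AND ui.id >= uo.id)) AS user_rank
FROM users uo
WHERE uo.id = ?
"""


def rank_of(user_id):
    """Return the rank of one user (1 = most points), or None if the id is unknown."""
    row = conn.execute(RANK_SQL, (user_id,)).fetchone()
    return row[3] if row else None


ranks = {name: rank_of(user_id) for user_id, name in
         conn.execute("SELECT id, name FROM users").fetchall()}
```

The subquery counts every user at or above the requested one, so the work per lookup grows with the number of higher-ranked rows unless an index on points helps the database narrow it down.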
The performance tradeoff would basically be what you described:
If you modified the structure to store a rank, queries would be very, very simple and fast. However, this would require some overhead any time "points" changed, as you'd have to verify that the rank hasn't changed. If the ranking had changed, you'd have to do multiple updates.
This causes more work (with the potential for bugs) at every update/insert. The tradeoff is very fast reads. If your typical usage is very few modifications compared to millions of reads, AND you found this query to be a bottleneck, it might be worth considering reworking this. However, I would avoid the added complexity and maintainability headaches unless you truly found this to be a problem, since the current solution requires less storage and is very flexible.
The link you reference is a MySQL question. If the original database had been Oracle the accepted answer would be to use an analytic function, which does scale, very nicely:
SQL> select id, name, points from users order by id
  2  /
        ID NAME           POINTS
---------- ---------- ----------
         1 john             4635
         3 tom              7364
         4 bob               234
         6 harry            9857
         8 algernon            1
         9 sebastian         234
        10 charles           888
7 rows selected.
SQL> select name, id, points, rank() over (order by points)
  2  from users
  3  /
NAME               ID     POINTS RANK()OVER(ORDERBYPOINTS)
---------- ---------- ---------- -------------------------
algernon            8          1                         1
bob                 4        234                         2
sebastian           9        234                         2
charles            10        888                         4
john                1       4635                         5
tom                 3       7364                         6
harry               6       9857                         7
7 rows selected.
SQL> select name, id, points, dense_rank() over (order by points desc)
  2  from users
  3  /
NAME               ID     POINTS DENSE_RANK()OVER(ORDERBYPOINTSDESC)
---------- ---------- ---------- -----------------------------------
harry               6       9857                                   1
tom                 3       7364                                   2
john                1       4635                                   3
charles            10        888                                   4
bob                 4        234                                   5
sebastian           9        234                                   5
algernon            8          1                                   6
7 rows selected.
SQL>
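Analytic (window) functions are not specific to Oracle. As a hedged sketch, the same two queries can be run through Python's built-in sqlite3 module, assuming the bundled SQLite is version 3.25 or newer (the release that added window functions). The outer ORDER BY is my addition so that the tied rows come back in a predictable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, points INTEGER);
INSERT INTO users VALUES
    (1, 'john', 4635), (3, 'tom', 7364), (4, 'bob', 234), (6, 'harry', 9857),
    (8, 'algernon', 1), (9, 'sebastian', 234), (10, 'charles', 888);
""")

# RANK() leaves a gap after ties: bob and sebastian share 2, charles gets 4.
ascending = conn.execute("""
    SELECT name, RANK() OVER (ORDER BY points)
    FROM users
    ORDER BY points, id
""").fetchall()

# DENSE_RANK() leaves no gap: bob and sebastian share 5, algernon gets 6.
descending = conn.execute("""
    SELECT name, DENSE_RANK() OVER (ORDER BY points DESC)
    FROM users
    ORDER BY points DESC, id
""").fetchall()
```

Both queries rank every row in one pass over the sorted data, which is the scaling advantage over running a correlated COUNT(*) per user.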
Doesn't the WHERE portion of that query internally require reading the entire table? I understand the point about premature optimization, but academically, it seems that this wouldn't scale beyond a few thousand rows.