Database relation many to many - database

alt text http://produits-lemieux.com/database.jpg
This is basicly my database structure
one product (let say soap) will have many retail selling size
1 liter
4 liters
20 liters
In my "produit" database I will have the soap item (id #1)
In the size database i will have many size availible :
1liter
4liter
20liter
How not to duplicate the product 3 time with a different size... i like to be able to have check box in the product size of all the size available in the database and check if yes or no (boolean)
The answer a got is perfect, but how to have the option like that :
soap [x] 1 liter , [ ] 4 liter , [x] 20 liter

I'm not sure I understand your exact scenario, but to create a many-to-many relationship, you simply create a "relationship table", in which you store id's for the two records you want to link.
Example:
Products
********
ProductID (PK)
Price
Retailers
*********
RetailerID (PK)
Name
ProductRetailerRelationships
****************************
ProductID
RetailerID

A many-to-many relationship is almost always modeled using an intermediate table. For your example,
Product
--------
prod_numero
...
Size
--------
size_numero
...
Product_Size
--------
prod_numero
size_numero
...
The Size table would contain particular sizes (say, 10 liter) and the Product_Size table creates a Product and Size pairing.

You Will need an Intermediary, or "Join" Table
ProductSizes
.......................
ProductID
SizeID
One record for each product-size combination

Based on the answers, here is the database tables layout as proposed, it look complicated to me, but are you sure it is the way to do this, the BEST solution ?
alt text http://produits-lemieux.com/database2.jpg

Related

How to edit Relationship properties to only use part of the CDM Identifier in LDM?

I'm creating a conceptual data model for a simplified web store using Power Designer.
I'm having trouble specifying the relation between an Order and a Receipt. I would like a receipt to only have a part of the order's identifier in its primary key in the logical model (more specifically, only order_id). I am unable to achieve this by tweaking the relationship properties (see the screenshots bellow; the problematic relationship is marked with a green arrow).
Should I simply omit the relation in the conceptual model?
Conceptual data model
Logical data model
EDIT
If perhaps it wasn't clear how I envisioned my tables…
User
username
password
mail
first_name
last_name
address
hacker123
greenGrass
david.norton#gmail.com
David
Norton
West Shire 40, 1240 Neverland
musicman100
SuperPassword
john.stewart#gmail.com
John
Stewart
Strange Alley 50, 1250 Outer Space
Product
product_id
name
description
price_per_unit
unit_of_measure
supply
1
Tooth Brush 100
NULL
5.99
piece
200
2
Super Paste 200
For sparkling smiles
7.99
piece
50
Order
order_id
username
product_id
amount
50
hacker123
1
2
50
hacker123
2
1
51
musicman100
1
5
Receipt
receipt_id
order_id
12
50
13
51
EDIT #2
I just realised that I should probably break up Order into two tables! One to track which products are on a particular order, and another to track who placed the order.
Perhaps I could even split the Order table into 3 parts
Order(order_id, order_time)
ProductsPerOrder(order_id, product_id, amount)
OrdersPlaced(order_id, username)
You have a contradiction... One part says that Order is identified by User+Product+Order; the other says that Order has its own identifier order_id.
I guess the second one is correct, with the usual design that Order has an id.
And you need to change the relationships in the CDM, between Order, and User/Product, to uncheck the Dependent property. These links are just mandatory, not dependent (which would mean that Order is defined relatively to User+Product).
p.s. the same holds for Receipt, which has its own identifier.
You can edit the relationship in the logical model!
If you click on a relationship, a Relationship properties dialog appears. There's a tab called Joins. This is where you can specify which columns to refer to with the relationship.

Simple database design - some columns have multiple values

Caveat: very new to database design/modeling, so bear with me :)
I'm trying to design a simple database that stores information about images in an archive. Along with file_name (which is one distinct string), I have fields like genre and starring where each field might contains multiple strings (if an image is associated with multiple genres, and/or if an image has multiple actors in it).
Right now the database is just a single table keyed on file_name, and the fields like starring and genre just have multiple comma-separated values stored. I can query it fine by using wildcards and like and in operators, but I'm wondering if there's a more elegant way to break out the data such that it is easier to use/query. For instance, I'd like to be able to find how many unique actors are represented in the archive, but I don't think that's possible with the current model.
I realize this is a pretty elementary question about data modeling, but any guidance anyone can provide or reading you can direct me to would be greatly appreciated!
Thanks!
You need to create extra tables in order to stick with the normalization. In your situation you need 4 extra tables to represent these n->m relations(2 extra would be enough if the relations were 1->n).
Tables:
image(id, file_name)
genre(id, name)
image_genres(image_id, genre_id)
stars(id, name, ...)
image_stars(image_id, star_id)
And some data in tables:
image table
id
file_name
1
/users/home/song/empire.png
2
/users/home/song/promiscuous.png
genre table
id
name
1
pop
2
blues
3
rock
image_genres table
image_id
genre_id
1
2
1
3
2
1
stars table
id
name
1
Jay-Z
2
Alicia Keys
3
Nelly Furtado
4
Timbaland
image_stars table
image_id
star_id
1
1
1
2
2
3
2
4
For unique actor count in database you can simply run the sql query below
SELECT COUNT(name) FROM stars

How do I normalize data in a database?

I'm in an intro to database management course and we're learning about normalizing data (1NF, 2NF, 3NF, etc.) and I'm super confused on how to actually go about and do it. I've read up on this, consulted various sites and youtube videos and I still can't seem to get it to click. I am using Microsoft Access 2013 if that's of any help.
This is the data I'm working with.
Thanks.
Edit1: Alright, I think I have the tables set up correctly. But now I'm having trouble actually inputting data to go from one table to the next. Here's my relationship table.
On a very basic level, any repeating values in a table are candidates for normalization. Duplicated data is usually a bad idea. Say you needed to update a patient's surname - you now have to update all the occurrences in this table, and possibly many others throughout the rest of the database. Much better to store each patient's details in one place only.
This is where normalization comes in. Looking down the columns, you can see that there are repeating values for data about dentists, patients and surgeries, so we should normalize towards having tables for each of these entities, as well as the original table that contains appointments, giving you four tables in total.
Extract the entities out into their own tables, and give each row a primary (unique) key - just use an incrementing integer for now. (Edit: as suggested in the comment we could use the natural keys of PatientNo, StaffNo and SurgeryNo instead of creating surrogates.)
Then, instead of each patient's name and number appearing multiple times in the appointments table, we just reference the key of the master record in the Patient table. This is called a foreign key.
Then, do the same for Dentist and Surgery.
You will end up with tables looking something like this:
APPOINTMENT
AppointmentID DentistID PatientID AppointmentTime SurgeryID
----------------------------------------------------------------
1 1 1 12 Aug 03 10:00 1
2 1 2 ... 2
3 2 3 ... 1
4 2 3 ... 1
5 3 2 ... 2
6 3 4 ... 3
DENTIST
DentistID Name StaffNo
--------------------------------------
1 Tony Smith S1011
2 Helen Pearson S1024
3 Robin Plevin S1032
PATIENT
PatientID Name PatientNo
---------------------------------------
1 Gillian White P100
2 Jill Bell P105
3 Ian MackKay P108
4 John Walker P110
SURGERY
SurgeryID SurgeryNo
-------------------------
1 S10
2 S15
3 S13
The first step is to data modelling and denormalization is to understand your data. Study it an understand the domain "objects" or tables that exist within your model. That will give you an idea of how to start. Sometimes a single table or query sample is not enough to fully understand the database, but in your case, we can use the sample data and make some assumptions.
Secondly, look for repeated / redundant data. If you see copies of names, there is a good chance that is a candidate for a foreign key. Our assumption tells us that STAFF_NO is a primary key candidate for DENTIST because each unique STAFF_NO correlates to a unique DENTIST_NAME, so I see a good candidate DENTIST table (STAFF_NO, DENTIST_NAME)
Example in some table of SURGERY:
ID STAFF_NO DENTIST_NAME
1 1 Fred Sanford
2 1 Fred Sanford
3 3 Lamont Sanford
4 3 Lamont Sanford
Why store these over and over? What happens when Fred says "But my correct name is Fred G Sanford", so you have to update your database. In the current table, you have to update the name is many rows. If you had normalized it, you'd have a single location for the name, in the DENTIST table.
So I can take the unique dentists and store them in DENTIST
create table DENTIST(staff_no integer primary key, dentist_name varchar(100));
-- One possible way to populate our dentist table is to use a distinct query from surgery
insert into DENTIST
select distinct staff_no, dentist_name from surgery;
STAFF_NO DENTIST_NAME
1 Fred Sanford
3 Lamont Sanford
SURGERY table now points to DENTIST table
ID STAFF_NO
1 1
2 1
3 3
4 3
And you can now create a view, VIEW_SURGERY to join the DENTIST_NAME back in to satisfy the needs of typical queries.
select s.id, d.staff_no, d.dentist_name
from surgery s join dentist d
on s.staff_no = d.staff_no -- join here
So now a unique update to DENTIST, by the dentist primary key will update a single row.
update dentist set name = 'Fred G Sanford' where staff_no = 1;
Add query view will show the updated name for N rows:
select * from view_surgery
ID STAFF_NO DENTIST_NAME
1 1 Fred G Sanford
2 1 Fred G Sanford
3 3 Lamont Sanford
4 3 Lamont Sanford
In short, you are removing redundancy.
This is just a sample, and one way to do it. Manual normalization like this is not as common when you have modelling tools, but the point is, we can look at data, spot redundancies and factor those redundancies into new tables, and relate those new tables by foreign keys and joins, then build views to represent the original data.

Best way to store results data in database? [duplicate]

This question already has answers here:
Is storing a delimited list in a database column really that bad?
(10 answers)
Closed 9 years ago.
I have results data like this:
1. account, name, #, etc
2. account, name, #, etc
...
10. account, name, #, etc
I have approximately 1 set of results data generated each week.
Currently it's stored like so:
DATETIME DATA_BLOB
Which is annoying because I can't query any of the data without parsing the BLOB into a custom object. I'm thinking of changing this.
I'm thinking of having one giant table:
DATETIME RANK ACCOUNT NAME NUMBER ... ETC
date1 1 user1 nn #
date1 2 user2 nn #
...
date1 10 userN nn #
date2 1 user5 nn #
date2 2 user12 nn #
...
date2 10 userX nn #
I don't know anything about database design principles, so can someone give me feedback on whether this is a good approach or there might be a better one?
Thanks
I think it is ok to have a table like that, if there are not one-to-many relationships. In that case, it would be more efficient to have multiple tables like in my example below. Here are some general tips as well:
Tip: Good practice My professor told me that it's always good to have an "ID" column, which is a unique number identifier for each item in the table (1, 2, 3… etc.). (Perhaps that was the intent of your "Number" column.) I think SQLite forces each table to have an ID column anyways.
Tip: Saving storage space - Also, if there is a one-to-many relationship (example: one name has many accounts) then it might save space to have a separate table for the accounts, and then store the ID of the name in the first table- so that way you are storing many ints instead of duplicate strings.
Tip: Efficiency - Some databases have specific frameworks designed to handle relationships such as many-to-one or many-to-many, so if you use their framework for that (I don't remember exactly how to do it) it will probably work more efficiently.
Tip: Saving storage space - If you make your own ID column it might be a waste if it automatically includes an "ID" column anyways - so you might want to check for that possibility.
Conceptual Example: (Storing multiple accounts for the same name)
Poor Solution:
Storing everything in 1 table (inefficient, because it duplicates Bob's name, rank, and datetime):
ID NAME RANK DATETIME ACCOUNT
1 Bob 1 date1 bob_account_1
2 Joe 2 date2 user2_joe
3 Bob 1 date1 bob_account_2
4 Bob 1 date1 bobs_third_account
Better Solution: Having 2 tables to prevent duplicated information (Also demonstrates the usefulness of ID's). I named the 2 tables "Account" and "Name."
Table 1: "Account" (Note that NAME_ID refers to the ID column of Table 2)
ID NAME_ID ACCOUNT
1 1 bob_account_1
2 2 user2_joe
3 1 bob_account_2
4 1 bobs_third_account
Table 2: "Name"
ID NAME RANK DATETIME
1 Bob 1 date1
2 Joe 2 date2
I'm not a database expert so this is just some of what I learned in my internet programming class. I hope this helps lead you in the right direction in further research.

Save 2-dimensional spreadsheet data to database

I have several tables of weight/price pairs with the following format:
(example for two tables, I have several such tables)
table 1: table 2:
weight price weight price
5 7 5 4
10 9 10 7
15 14 15 11
20 15 20 14
... ...
Each weight/price table has the same amount of rows but it should be possible to edit the values when needed. It should also be possible to add new tables at a later time with as little trouble as possible.
I would like for the table to be almost like an attribute of another entity. Is there a way to do such a thing? Some suggest to simply store the files on the disk and read them when needed, but that solution is not perfect for me, since I would like to edit the values on occasion.
What is the correct structure to store such data?
In SQL you could simply do something like this (assuming "Product" is the thing that identifies each set of weight/price pairs and assuming weight determines price):
CREATE TABLE ProductWeightPrice
(Product VARCHAR(20) NOT NULL,
Weight INT NOT NULL,
Price INT NOT NULL,
PRIMARY KEY (Product,Weight));
If you need more help then please say what DBMS you are using.
Would it work for you to have a "table number" field? So it becomes:
product_id,
table_number,
weight,
price
With a primary key on product_id, table_number.
Now it isn't a "table number" any more, but more a "pair number", or similar, so you could change the names to be more meaningful.

Resources