Resolve many to many relationship - database

Does anyone have a process or approach for determining how to resolve a many-to-many relationship in a relational database? Here is my scenario. I have a group of contacts and a group of phone numbers. Each contact can be associated with multiple phone numbers and each phone number can be associated with multiple contacts.
A simple example of this situation would be an office with two employees (e1 & e2), one private voice line (v1), and one main voice line (v2). e1 is the CEO, so they have their own private voice line, v1, but they can also be reached by calling the main line, v2, and asking for the CEO. e2 is just an employee and can only be reached by calling v2.
So, e1 should be related to v1 & v2. e2 should be related to v2. Conversely, v1 should be related to e1 and v2 should be related to e1 & e2.
The goal here is to be able to run queries like "what numbers can e1 be reached at" and "what employees can be reached at v2", etc. I know the answer will involve an intermediate table or tables, but I just can't seem to nail down the exact architecture.

You don't need any temp tables for the query. There is an intermediary table for the mapping.
numbers_tbl
-----------
nid int
number varchar
employees_tbl
-----------
eid int
name varchar
employee_to_phone_tbl
-----------
eid int
nid int
How can I call Bob?
select *
from employees_tbl e
inner join employee_to_phone_tbl m
on e.eid = m.eid
inner join numbers_tbl n
on m.nid = n.nid
where e.name = 'Bob'
Who might pick up if I call this number?
select *
from numbers_tbl n
inner join employee_to_phone_tbl m
on m.nid = n.nid
inner join employees_tbl e
on e.eid = m.eid
where n.number = '555-5555'
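To try the two queries above against the office example from the question, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow this answer; the helper function names and the choice of SQLite are my own assumptions, so adapt the SQL to your actual DBMS.

```python
import sqlite3


def build_db():
    """Create the three tables in memory and load the office example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employees_tbl (eid INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE numbers_tbl (nid INTEGER PRIMARY KEY, number TEXT);
        CREATE TABLE employee_to_phone_tbl (eid INTEGER, nid INTEGER);

        INSERT INTO employees_tbl VALUES (1, 'e1'), (2, 'e2');
        INSERT INTO numbers_tbl VALUES (1, 'v1'), (2, 'v2');
        -- e1 -> v1, e1 -> v2, e2 -> v2
        INSERT INTO employee_to_phone_tbl VALUES (1, 1), (1, 2), (2, 2);
        """
    )
    return conn


def numbers_for_employee(conn, name):
    """Hypothetical helper: which numbers reach this employee?"""
    rows = conn.execute(
        """
        SELECT n.number
        FROM employees_tbl e
        INNER JOIN employee_to_phone_tbl m ON e.eid = m.eid
        INNER JOIN numbers_tbl n ON m.nid = n.nid
        WHERE e.name = ?
        ORDER BY n.number
        """,
        (name,),
    ).fetchall()
    return [r[0] for r in rows]


def employees_for_number(conn, number):
    """Hypothetical helper: who might pick up on this number?"""
    rows = conn.execute(
        """
        SELECT e.name
        FROM numbers_tbl n
        INNER JOIN employee_to_phone_tbl m ON m.nid = n.nid
        INNER JOIN employees_tbl e ON e.eid = m.eid
        WHERE n.number = ?
        ORDER BY e.name
        """,
        (number,),
    ).fetchall()
    return [r[0] for r in rows]


conn = build_db()
print(numbers_for_employee(conn, "e1"))
print(employees_for_number(conn, "v2"))
```

Both directions go through the same mapping table; only the starting table and the WHERE clause change.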

Employees:
eID, eName
1, e1
2, e2
PhoneNumbers:
pID, pNumber
1, v1
2, v2
EmployeePhones:
eID, pID
1, 1
1, 2
2, 2
Then you inner join. If you need to find out what number(s) e1 can be reached at (T-SQL):
SELECT E.eName, P.pNumber
FROM dbo.Employees E
INNER JOIN dbo.EmployeePhones EP ON E.eID = EP.eID
INNER JOIN dbo.PhoneNumbers P ON EP.pID = P.pID
WHERE E.eName = 'e1'
I believe this should work (testing it right now...)
EDIT: Took me a few minutes to type up, sorry for duplication...

Others have explained the schema, but I'm going to explain the concept. What they're building for you, the table named EmployeePhones and employee_to_phone_tbl, is called an Associative Entity, which is a type of Weak Entity.
A Weak Entity does not have its own natural key and must instead be defined in terms of its foreign keys. An Associative Entity exists for the sole purpose of mapping a many-to-many relationship in a database that does not support the concept. Its primary key is the grouped foreign keys to the tables it maps.
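To make the "primary key is the grouped foreign keys" point concrete, here is a rough sketch in Python with the built-in sqlite3 module. The table name employee_phones, the helper try_link, and the use of SQLite are assumptions for illustration only. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3


def build_db():
    """Associative table whose primary key is the pair of foreign keys."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript(
        """
        CREATE TABLE employees (eid INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE numbers (nid INTEGER PRIMARY KEY, number TEXT NOT NULL);
        CREATE TABLE employee_phones (
            eid INTEGER NOT NULL REFERENCES employees(eid),
            nid INTEGER NOT NULL REFERENCES numbers(nid),
            PRIMARY KEY (eid, nid)  -- composite key: each pair appears once
        );

        INSERT INTO employees VALUES (1, 'e1'), (2, 'e2');
        INSERT INTO numbers VALUES (1, 'v1'), (2, 'v2');
        INSERT INTO employee_phones VALUES (1, 1), (1, 2), (2, 2);
        """
    )
    return conn


def try_link(conn, eid, nid):
    """Hypothetical helper: return True if the pair was stored, else False."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO employee_phones (eid, nid) VALUES (?, ?)",
                (eid, nid),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_db()
print(try_link(conn, 1, 1))   # pair already exists, rejected by the primary key
print(try_link(conn, 99, 1))  # employee 99 does not exist, rejected by the foreign key
print(try_link(conn, 2, 1))   # new, valid pair
```

The composite key stops the same employee/number pair from being recorded twice, and the foreign keys stop a mapping row from pointing at an employee or number that does not exist.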
For further information on relational theory, see this link

Normalize
Best Practices on Referential Integrity
Check this - http://statisticsio.com/Home/tabid/36/articleType/ArticleView/articleId/327/Need-Some-More-Sample-Databases.aspx

After just a little more thought, here is what I came up with. It probably goes along with the approach AviewAnew is thinking of.
employees
id (index)
name
numbers
id (index)
number
relations
employees.id (index)
numbers.id (index)
employees
1 : e1
2 : e2
numbers
1 : v1
2 : v2
relations
1 : 1
1 : 2
2 : 2
Is this the best/only approach?


Returning from a join the first result of one column based on a second column

I need some help to improve part of my query. The query is returning the correct data, I just need to exclude some extra information that I don't need.
I believe that one of the main parts that will change is:
JOIN TBL_DATA_TYPE_RO_BODY TB ON TB.FK_ID_TBL_FILE_NAMES=VMI.ID_TBL_FILE_NAMES
In this part, if I have, for example, 2 matching FK_ID_TBL_FILE_NAMES values, it will return 2 results from TBL_DATA_TYPE_RO_BODY.
The data that I have is (I excluded some extra columns):
If I have 2 or more equal Mag values for the same "ONLY_FILE_NAME", I should return only the first one (I don't care about the others). I believe this is a simple case for GROUP BY, but I am having trouble doing the GROUP BY on the join.
My ideas:
Use SELECT TOP (e.g. here)
Use FIRST_VALUE (e.g. here)
What I have (note the 2 last lines):
Freq|Mag|Phase|Date|ONLY_FILE_NAME
1608039|767|3234|37:00.0|RO_Mass_Load_4b
1608039|781|3371|44:00.0|RO_Mass_Load_4b
1608039|788|3138|37:00.0|RO_Mass_Load_4b
1608039|797|3326|44:00.0|RO_Mass_Load_4b
1608039|808|3117|37:00.0|RO_Mass_Load_4b
1608039|808|3269|44:00.0|RO_Mass_Load_4b
What I would like to have (note the last line):
Freq|Mag|Phase|Date|ONLY_FILE_NAME
1608039|767|3234|37:00.0|RO_Mass_Load_4b
1608039|781|3371|44:00.0|RO_Mass_Load_4b
1608039|788|3138|37:00.0|RO_Mass_Load_4b
1608039|797|3326|44:00.0|RO_Mass_Load_4b
1608039|808|3117|37:00.0|RO_Mass_Load_4b
Note that the mag field is coming from my JOIN.
Ideas? Any help?
In case you want to see it, the whole code is:
SELECT TW.CURRENT_MEASUREMENT as Cycle_Current_Measurement,
TW.REF_MEASUREMENT as Cycle_Ref_Measurement,
CONVERT(REAL,TT.CURRENT_TEMP) as Cycle_Current_Temp,
CONVERT(REAL,TT.REF_TEMP) as Cycle_Ref_Temp,
TP.TYPE as Cycle_Type, TB.FREQUENCY as Freq,
TB.MAGNITUDE as Mag,
TB.PHASE as Phase,
VMI.TIME_FORMATTED as Date,
VMI.ID_TBL_FILE_NAMES as IdFileNames, VMI.ID_TBL_DATA_TYPE_RO_HEADER as IdHeader, VMI.*
FROM VW_MAIN_INFO VMI
JOIN TBL_DATA_TYPE_RO_BODY TB ON TB.FK_ID_TBL_FILE_NAMES=VMI.ID_TBL_FILE_NAMES
LEFT JOIN TBL_POINTS_AND_CYCLES TP ON VMI.ID_TBL_DATA_TYPE_RO_HEADER = TP.FK_ID_TBL_DATA_TYPE_RO_HEADER
LEFT JOIN TBL_POINTS_AND_MEASUREMENT TW ON VMI.ID_TBL_DATA_TYPE_RO_HEADER = TW.FK_ID_TBL_DATA_TYPE_RO_HEADER
LEFT JOIN TBL_POINTS_AND_TEMP TT ON VMI.ID_TBL_DATA_TYPE_RO_HEADER = TT.FK_ID_TBL_DATA_TYPE_RO_HEADER
Try something like this. The PARTITION BY is like a GROUP BY; it defines the groups within which ROW_NUMBER assigns integers, starting at 1 and incrementing by 1. The ORDER BY tells ROW_NUMBER which rows should get the lower numbers. So in this example, the lowest date will have RID = 1. Then wrap it in a subquery and select only those rows which have RID = 1.
select *
from (select RID = row_number() over (partition by tb.Magnitude order by vmi.time_formatted)
from ...<rest of your query>) a
where a.RID = 1
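Here is a small runnable sketch of the same ROW_NUMBER idea against the six sample rows from the question, using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The readings table and the function name are made up for the demo, and I partition by file name plus magnitude because the question asks for the first row per Mag within the same ONLY_FILE_NAME; that is my reading, not part of the answer above.

```python
import sqlite3

# The six sample rows from the question: Freq, Mag, Phase, Date, ONLY_FILE_NAME
ROWS = [
    (1608039, 767, 3234, "37:00.0", "RO_Mass_Load_4b"),
    (1608039, 781, 3371, "44:00.0", "RO_Mass_Load_4b"),
    (1608039, 788, 3138, "37:00.0", "RO_Mass_Load_4b"),
    (1608039, 797, 3326, "44:00.0", "RO_Mass_Load_4b"),
    (1608039, 808, 3117, "37:00.0", "RO_Mass_Load_4b"),
    (1608039, 808, 3269, "44:00.0", "RO_Mass_Load_4b"),
]


def first_row_per_mag(rows):
    """Keep only the earliest row for each (file name, magnitude) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings "
        "(freq INTEGER, mag INTEGER, phase INTEGER, date TEXT, only_file_name TEXT)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT freq, mag, phase, date, only_file_name
        FROM (
            SELECT r.*,
                   ROW_NUMBER() OVER (
                       PARTITION BY only_file_name, mag  -- one group per file and Mag
                       ORDER BY date                     -- earliest date gets rid = 1
                   ) AS rid
            FROM readings r
        ) a
        WHERE a.rid = 1
        ORDER BY mag
    """
    return conn.execute(query).fetchall()


for row in first_row_per_mag(ROWS):
    print(row)
```

With this data the duplicate Mag 808 row dated 44:00.0 is dropped and the 37:00.0 row is kept, leaving five rows.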

DBMS: Relational Algebra Execution Plan Cost Calculation

I have been trying for the last few days to come up with a solution to the following question.
Let's suppose that we have the following three tables.
Film(ID',Title,Country,Production_Date)
Actor(ID',Name,Genre,Nationality)
Cast(Actor_ID',Film_ID',Role)
Given information:
Film holds N(film)=50.000 records, r(film)=40bytes, sequential organized, index on PK
Actor holds N(actor)=200.000 records r(actor)=80bytes,heap organized, index on PK
Cast holds N(cast)=100.000 records,r(cast)=25 bytes, heap organized, No INDEXES
The execution tree and relation expression for an execution plan is in the following picture:
For the lower-level join between Cast and Film I'm calculating the following:
Block Nested Loop Join : Bcast x Bfilm
Index Nested Loop Join : Bcast + Ncast x Cfilm
I'm keeping the smallest value, which is given by the INLJ.
Question:
Now, how can I calculate the size of the joined table and its new r (the size of a record in the joined table)? I need both in order to calculate the cost B, in blocks, of the upper-level join between the already joined table and the Actor table.
I assume you want to do a natural join on FILM.ID = CAST.FILM_ID and CAST.FILM_ID is a foreign key referencing FILM.ID.
1) Size of one row:
A join of Film and Cast results in tuples of the form
[FILM_ID, TITLE, COUNTRY, PRODUCTION_DATE, ACTOR_ID, ROLE].
Hence the row size should be something like
R(FILM JOIN CAST) = R(FILM) + R(CAST) - R(FILM_ID)
since the FILM_ID is the only column which is shared.
2) Number of rows:
N(FILM JOIN CAST) = N(CAST)
As there is exactly one row in FILM for every row in CAST.
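As a quick worked sketch of the two formulas above, in Python: the question does not give the size of FILM_ID or the block size, so the 4 bytes and 1024 bytes used below are purely hypothetical placeholders, and the block count assumes records are not split across blocks.

```python
import math


def joined_row_size(r_film, r_cast, r_film_id):
    """R(FILM JOIN CAST) = R(FILM) + R(CAST) - R(FILM_ID)."""
    return r_film + r_cast - r_film_id


def joined_row_count(n_cast):
    """Each Cast row matches exactly one Film row, so N(join) = N(CAST)."""
    return n_cast


def blocks_needed(n_rows, row_size, block_size):
    """Blocks for n_rows records, assuming no record spans two blocks."""
    rows_per_block = block_size // row_size
    return math.ceil(n_rows / rows_per_block)


# Given in the question: r(film) = 40 bytes, r(cast) = 25 bytes, N(cast) = 100,000
# Assumed here (NOT given): FILM_ID takes 4 bytes, a block holds 1024 bytes
r_join = joined_row_size(40, 25, 4)
n_join = joined_row_count(100_000)
b_join = blocks_needed(n_join, r_join, 1024)
print(r_join, n_join, b_join)
```

Plug in the real key size and block size from your course material; the resulting block count is what you would carry into the cost formula for the upper-level join with Actor.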

How to depict table joins in diagram?

I have several tables (let's call them A, B, C, D, etc.), where each has multiple columns (A has columns a1, a2, etc.).
I'd like to diagrammatically represent the following:
Inner join A and B where A.a1 = B.b1, then check the value of B.b2. If B.b2 = 1, inner join the result with C where A.a1 = C.c1. If B.b2 = 2, inner join the result with D where A.a1 = D.d1.
I've tried to use a traditional flowchart, but I'd like to keep track of the tables and their columns. With a database schema diagram, however, I'm not sure how to depict things like logical conditions.
What's the best way to do this?

join type for comparing two rows

Just can't seem to figure this out, although it seems rather simple.
Table: Attd (...short for Attendance)
Visit Person Status Date
1 1 Member 2011-01-31
2 1 Member 2011-02-05
3 2 Member 2011-02-05
4 3 Not 2011-01-07
5 1 Not 2011-01-25
6 1 Not 2011-01-20
7 1 Not 2011-02-03
The data belongs to visits to a location by individuals, which includes if they had a membership (Status column).
How would you select visits that took place one week before someone became a member (Same person: Status=Not --> Status=Member)? [Output row 5 above.]
For example,
Person 2 became a member without visiting before, because they had no Status=Not before they joined.
Person 3 visited as a non-member and never came back.
And, person 1 visited as a non-member (Status=Not on 2011-01-25) and became a member within one week (Status=Member on 2011-01-31).
Preliminary work:
a. Pretty sure the answer contains a self join
b. The dateAdd function helps satisfy the one-week-before condition
Try this simple query:
SELECT t1.Person
FROM
(SELECT *
FROM attd
WHERE status = 'member')T1
INNER JOIN
(SELECT *
FROM attd
WHERE status = 'not')T2 ON T1.Person = T2.Person
WHERE datediff(dd,T2.Date,T1.Date) BETWEEN 0 AND 7
You can find a working example on SQLFiddle.
Hope this helps you and feel free to contact me if you have any more questions.
First use aggregation to calculate the member date for each person. This is presumably the minimum date where there is a status of 'Member'. Then do a join back to the table to get earlier visits:
select *
from (select a.person, min(date) as MemberDate
from Attendance a
where status = 'Member'
group by a.person
) m join
Attendance aprev
on m.person = aprev.person and
datediff(d, aprev.date, m.MemberDate) between 0 and 7 and
aprev.status = 'Not';
Strictly speaking, the last condition, aprev.status = 'Not', is unnecessary, because those are the only statuses before the member date. However, I think it clarifies the intent of the query.
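For anyone who wants to check the aggregate-then-join approach against the sample table, here is a sketch in Python with the built-in sqlite3 module. SQLite has no DATEDIFF, so I swap in julianday() subtraction for the day difference, and I bound the difference below by 0 so that non-member visits after the join date are excluded; the table layout and function name are assumptions taken from the question's sample data.

```python
import sqlite3

# Sample rows from the question: Visit, Person, Status, Date
VISITS = [
    (1, 1, "Member", "2011-01-31"),
    (2, 1, "Member", "2011-02-05"),
    (3, 2, "Member", "2011-02-05"),
    (4, 3, "Not", "2011-01-07"),
    (5, 1, "Not", "2011-01-25"),
    (6, 1, "Not", "2011-01-20"),
    (7, 1, "Not", "2011-02-03"),
]


def visits_before_membership(visits, max_days=7):
    """Non-member visits made at most max_days before the person's first member visit."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE attd (visit INTEGER, person INTEGER, status TEXT, date TEXT)"
    )
    conn.executemany("INSERT INTO attd VALUES (?, ?, ?, ?)", visits)
    query = """
        SELECT aprev.visit, aprev.person, aprev.date, m.member_date
        FROM (
            SELECT person, MIN(date) AS member_date  -- first visit as a member
            FROM attd
            WHERE status = 'Member'
            GROUP BY person
        ) m
        JOIN attd aprev
          ON aprev.person = m.person
         AND aprev.status = 'Not'
         AND julianday(m.member_date) - julianday(aprev.date) BETWEEN 0 AND ?
        ORDER BY aprev.visit
    """
    return conn.execute(query, (max_days,)).fetchall()


print(visits_before_membership(VISITS))
```

On the sample data this returns only visit 5 (person 1, 2011-01-25), which is the row the question asks for.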

How to save Large relational data

I hope you are all well.
I have a scenario in which I have to write a very large set of relational/combinational data, and I am looking for an implementation technique that is very fast. It's something like an expert system in AI.
I have 4 entities, Questions, Options, Benefits and Scenarios:
Each question can have multiple options
Each option can relate to single question
On any combination of options a benefit is allocated; the allocation is called a scenario
A scenario can relate to any number of options
A scenario can relate to any number of benefits
Each benefit can be included in multiple scenarios
Now, for instance, let's look at an example:
We have 4 questions: q1, q2, q3, q4
q1 has 3 options: q1o1, q1o2, q1o3
q2 has 4 options: q2o1, q2o2, q2o3, q2o4
q3 has 5 options: q3o1, q3o2, q3o3, q3o4, q3o5
q4 has 2 options: q4o1, q4o2
scenario 1: for combination of [q1o1,q2o1] a benefit b1 is allocated
scenario 2: for combination of [q1o1,q2o1,q3o3] a benefit b2 is allocated
scenario 3: for combination of [q2o1,q3o4] a benefit b3 is allocated
scenario 4: for combination of [q3o4,q4o1] a benefit b4 is allocated
scenario 5: for combination of [q4o2] a benefit b5 is allocated
scenario 6: for combination of [q1o2,q2o2,q3o1,q4o1] a benefit b5 is allocated
So in this way
( (3+1) C 1 x (4+1) C 1 x (5+1) C 1 x (2+1) C 1 ) - 1
= ( 4 x 5 x 6 x 3 ) - 1
= 360 - 1
= 359
scenarios can be built, where C denotes combination.
And if the number of questions goes up to 25 and each question has 5 options,
(5+1) ^ 25 - 1
= 6 ^ 25 - 1
= 28430288029929701375
scenarios can be built.
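The counting above can be checked with a few lines of Python. The reasoning, as I read it, is that each question contributes either one of its options or nothing (hence the +1), and the single case where every question contributes nothing is subtracted at the end. The function name is just for illustration.

```python
from math import prod


def scenario_count(options_per_question):
    """Number of non-empty combinations with at most one option per question."""
    # Each question offers its n options plus the choice "not part of the scenario".
    # Subtract 1 for the combination in which no question is used at all.
    return prod(n + 1 for n in options_per_question) - 1


print(scenario_count([3, 4, 5, 2]))  # the 4-question example
print(scenario_count([5] * 25))      # 25 questions with 5 options each
```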
I am looking for a best way to store this relational/combinational data to the database and want to access it back. Will wait for response of you guys.
The following set of tables will do it.
question:
id
...
option:
id
question_id
...
option_scenario:
option_id
scenario_id
scenario:
id
option_count
...
scenario_benefit:
scenario_id
benefit_id
benefit:
id
...
The one thing that is denormalized in the design is that scenario.option_count should be the count of things in option_scenario with that scenario_id.
To query it you'll need to use subqueries heavily. Suppose that person_option is another table holding the options a specific person has chosen. Then to find the benefits for that person you'll need:
SELECT b.*
FROM (
SELECT so.scenario_id
FROM person_option po
JOIN option_scenario so
ON so.option_id = po.option_id
JOIN scenario s
ON s.id = so.scenario_id
WHERE po.person_id = ?
GROUP BY so.scenario_id, s.option_count
HAVING s.option_count = COUNT(DISTINCT po.option_id)
) ps
JOIN scenario_benefit sb
ON sb.scenario_id = ps.scenario_id
JOIN benefit b
ON b.id = sb.benefit_id
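Here is a cut-down, runnable sketch of that matching query in Python with the built-in sqlite3 module, loaded with scenarios 1, 2, 3 and 5 from the question. To keep it short I use the option and benefit labels directly as ids and leave out the question, option and benefit tables; those simplifications, the person_option layout, and the function names are assumptions for the demo only.

```python
import sqlite3


def build_db():
    """Load a few of the question's scenarios into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE scenario (id INTEGER PRIMARY KEY, option_count INTEGER);
        CREATE TABLE option_scenario (option_id TEXT, scenario_id INTEGER);
        CREATE TABLE scenario_benefit (scenario_id INTEGER, benefit_id TEXT);
        CREATE TABLE person_option (person_id INTEGER, option_id TEXT);

        -- option_count is the denormalized number of options in each scenario
        INSERT INTO scenario VALUES (1, 2), (2, 3), (3, 2), (5, 1);
        INSERT INTO option_scenario VALUES
            ('q1o1', 1), ('q2o1', 1),
            ('q1o1', 2), ('q2o1', 2), ('q3o3', 2),
            ('q2o1', 3), ('q3o4', 3),
            ('q4o2', 5);
        INSERT INTO scenario_benefit VALUES (1, 'b1'), (2, 'b2'), (3, 'b3'), (5, 'b5');

        -- person 1 picked one option per question
        INSERT INTO person_option VALUES
            (1, 'q1o1'), (1, 'q2o1'), (1, 'q3o3'), (1, 'q4o1');
        """
    )
    return conn


def benefits_for_person(conn, person_id):
    """Benefits of every scenario whose options are all among the person's options."""
    query = """
        SELECT DISTINCT sb.benefit_id
        FROM (
            SELECT so.scenario_id
            FROM person_option po
            JOIN option_scenario so ON so.option_id = po.option_id
            JOIN scenario s ON s.id = so.scenario_id
            WHERE po.person_id = ?
            GROUP BY so.scenario_id, s.option_count
            -- a scenario matches only if every one of its options was picked
            HAVING s.option_count = COUNT(DISTINCT po.option_id)
        ) ps
        JOIN scenario_benefit sb ON sb.scenario_id = ps.scenario_id
        ORDER BY sb.benefit_id
    """
    return [row[0] for row in conn.execute(query, (person_id,))]


conn = build_db()
print(benefits_for_person(conn, 1))
```

Person 1 satisfies scenarios 1 and 2 completely, so b1 and b2 come back; scenario 3 is only half matched (q2o1 without q3o4) and scenario 5 is not matched at all. The stored option_count is what lets the HAVING clause tell a full match from a partial one without a second lookup.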
