Designing a good database - database

I have the following tree:
courses, that have many groups, each having multiple students.
So the drilled-down tree looks like this: courses -> groups -> students.
I think there are two ways to represent this:
1) students table that has a group_id FK to groups; groups table that has a course_id FK to courses; or:
2) the first option, plus the students table having both group_id and course_id FKs, so that I have more freedom to fetch data without having to JOIN the "parent" table every time.
One good example is getting all students that are part of a course (whatever the group). In this case, going with only the first option forces me to JOIN the groups table, which is not needed in that scope. So I tend to always choose the second option, even if the "main" table gets a few more FK columns.
How do you approach this?
The example gets more complicated if you add a couple more tables above the courses table, like teachers (that teach courses) and schools (that have teachers). If you need to see all the students in a school, you need to join the groups, courses, teachers and schools tables.
Thank you!
LE: I am excluding many to many relationships from this example, those are treated differently.
LLE: And yes, if it sounds like convenience (aka performance)... it might be true :)

Personally, I try to keep a database as normalized as possible for clarity in the data model. So I would say that if every student is going to be linked to the course through a group (no possibility of a student not being in a group), then there should not be a relationship directly between students and courses. Don't sacrifice the clarity of your data model for the sake of writing less SQL.
Also, you probably realize this, but you'll need linking tables for any many-to-many relationships. I'm not sure what kind of groups you're talking about, but if they can exist over multiple courses you'll need a Course --> CourseGroup (FK CourseID, FK GroupID) --> Group structure, and if students can belong to multiple groups you'll need Group --> GroupMembership (FK GroupID, FK StudentID)--> Student.
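As a minimal sketch of the normalized design described above (not the asker's actual schema), the following uses Python's built-in sqlite3 module with an in-memory database. The table name study_groups, the column names and the sample rows are assumptions made for illustration; study_groups stands in for the groups table so the name stays clear of SQL keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE courses      (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE study_groups (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                           course_id INTEGER NOT NULL REFERENCES courses(id));
CREATE TABLE students     (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                           group_id INTEGER NOT NULL REFERENCES study_groups(id));
INSERT INTO courses      VALUES (1, 'Algebra'), (2, 'Physics');
INSERT INTO study_groups VALUES (10, 'A', 1), (11, 'B', 1), (20, 'C', 2);
INSERT INTO students     VALUES (100, 'Ann', 10), (101, 'Bob', 11), (102, 'Cid', 20);
""")


def students_in_course(course_id):
    """All students of a course, reached through the groups table."""
    rows = conn.execute(
        """SELECT s.name
           FROM students AS s
           JOIN study_groups AS g ON g.id = s.group_id
           WHERE g.course_id = ?
           ORDER BY s.name""",
        (course_id,),
    )
    return [name for (name,) in rows]


algebra_students = students_in_course(1)
```

The single JOIN is the whole cost of leaving course_id out of the students table; the student-to-course link is stored in exactly one place.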

The addition of the course_id into the students table would be considered "denormalization" and is perfectly acceptable for exactly the reason you are trying to solve : "performance".
Denormalization Wiki
In computing, denormalization is the process of attempting to optimize the read performance of a database by adding redundant data or by grouping data
So yeah, your second option is doing just this ... attempting to improve performance by adding redundant data.
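A hedged sketch of that second option, again with Python's sqlite3 module and an in-memory database. The composite foreign key is an addition that the answer does not mention: it is one common way to keep the redundant course_id from drifting out of sync with the group's course. The table name study_groups (standing in for groups), the helper add_student and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE courses (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE study_groups (
    id        INTEGER PRIMARY KEY,
    course_id INTEGER NOT NULL REFERENCES courses(id),
    UNIQUE (id, course_id)            -- target for the composite FK below
);
CREATE TABLE students (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    group_id  INTEGER NOT NULL,
    course_id INTEGER NOT NULL,       -- redundant copy (the denormalization)
    FOREIGN KEY (group_id, course_id) REFERENCES study_groups(id, course_id)
);
INSERT INTO courses      VALUES (1, 'Algebra'), (2, 'Physics');
INSERT INTO study_groups VALUES (10, 1), (20, 2);
""")


def add_student(student_id, name, group_id, course_id):
    """Insert a student; return False if a constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO students VALUES (?, ?, ?, ?)",
                (student_id, name, group_id, course_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


consistent = add_student(100, "Ann", 10, 1)    # group 10 belongs to course 1
inconsistent = add_student(101, "Bob", 10, 2)  # group 10 is not in course 2

# No JOIN needed: course_id is stored on the student row itself.
names_in_course_1 = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM students WHERE course_id = ? ORDER BY name", (1,)
    )
]
```

Without the composite key, nothing would stop a student row from naming a course that its group does not belong to, which is the usual price of redundant data.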

Related

What is the best way to maintain shared fields between tables in database?

I want to create a database for school management system. The database has two tables contain shared fields like Students and Teachers.
For example, Student table has fields(id, name, phone, class), and Teacher table has fields(id, name, phone, department).
Is it better to make: a table called Person which has fields(id, name, phone), Student table has fields(id, person_id, class), and Teacher table has fields(id, person_id, department).
Which of the two ways is better?
Giving a direct answer to this question might be opinion based. There is no universally best strategy for designing a database.
Theoretical example: if there is a large amount of data, you might want to think about performance: what you search for, how you search, and which joins that requires.
If you mostly search Students and Teachers separately, you might skip the Person table and query each table directly. Then, if you wanted to search all persons in the database, you would need two queries combined with a UNION over the fields that are common to both types of person.
If you also search persons frequently, you might create the Person table and give Teacher and Student a foreign key to Person. Then, when searching either of the latter two, you would need a JOIN to Person.
In real life I do not know whether there is really a big difference in this use case. More important is that you select a strategy and follow it in your future decisions in order to keep the design clear.
However, a situation may still arise where the selected strategy needs to change, at least in theory.
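The two strategies can be put side by side in a small sketch using Python's sqlite3 module. The table names with _a and _b suffixes, the sample rows and the phone numbers are assumptions for illustration only; the columns follow the ones listed in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Strategy A: no shared table; the common columns are repeated.
CREATE TABLE student_a (id INTEGER PRIMARY KEY, name TEXT, phone TEXT, class TEXT);
CREATE TABLE teacher_a (id INTEGER PRIMARY KEY, name TEXT, phone TEXT, department TEXT);
INSERT INTO student_a VALUES (1, 'Ann', '555-0101', '7B');
INSERT INTO teacher_a VALUES (1, 'Bob', '555-0102', 'Math');

-- Strategy B: a shared person table referenced by both roles.
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, phone TEXT);
CREATE TABLE student_b (id INTEGER PRIMARY KEY,
                        person_id INTEGER NOT NULL REFERENCES person(id),
                        class TEXT);
CREATE TABLE teacher_b (id INTEGER PRIMARY KEY,
                        person_id INTEGER NOT NULL REFERENCES person(id),
                        department TEXT);
INSERT INTO person    VALUES (1, 'Ann', '555-0101'), (2, 'Bob', '555-0102');
INSERT INTO student_b VALUES (1, 1, '7B');
INSERT INTO teacher_b VALUES (1, 2, 'Math');
""")

# Strategy A: listing every person needs a UNION over the common columns.
# UNION ALL is used here so that identical rows are not silently merged.
all_persons_a = conn.execute("""
    SELECT name, phone FROM student_a
    UNION ALL
    SELECT name, phone FROM teacher_a
    ORDER BY name
""").fetchall()

# Strategy B: listing every person is a plain query on one table ...
all_persons_b = conn.execute(
    "SELECT name, phone FROM person ORDER BY name"
).fetchall()

# ... but reading a student's full details now needs a JOIN.
students_b = conn.execute("""
    SELECT p.name, s.class
    FROM student_b AS s
    JOIN person AS p ON p.id = s.person_id
""").fetchall()
```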
Related question here

Entity Relationship Diagram. How does the IS A relationship translate into tables?

I was simply wondering, how an ISA relationship in an ER diagram would translate into tables in a database.
Would there be 3 tables? One for person, one for student, and one for Teacher?
Or would there be 2 tables? One for student, and one for teacher, with each entity having the attributes of person + their own?
Or would there be one table with all 4 attributes and some of the squares in the table being null depending on whether it was a student or teacher in the row?
NOTE: I forgot to add this, but there is full coverage for the ISA relationship, so a person must be either a student or a teacher.
Assuming the relationship is mandatory (as you said, a person has to be a student or a teacher) and disjoint (a person is either a student or a teacher, but not both), the best solution is with 2 tables, one for students and one for teachers.
If the participation is instead optional (which is not your case, but let's put it for completeness), then the 3 tables option is the way to go, with a Person(PersonID, Name) table and then the two other tables which will reference the Person table, e.g.
Student(PersonID, GPA), with PersonID being PK and FK referencing Person(PersonID).
The 1 table option is probably not the best way here, and it will produce several records with null values (if a person is a student, the teacher-only attributes will be null and vice-versa).
If the disjointness is different, then it's a different story.
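For the three-table case, a minimal sketch with Python's sqlite3 module might look like the following. Person(PersonID, Name) and Student(PersonID, GPA) come from the answer; the Teacher columns, the sample rows and the role query are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Person  (PersonID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
-- In each subtype table, PersonID is both the PK and an FK to Person.
CREATE TABLE Student (PersonID INTEGER PRIMARY KEY REFERENCES Person(PersonID),
                      GPA REAL);
CREATE TABLE Teacher (PersonID INTEGER PRIMARY KEY REFERENCES Person(PersonID),
                      Salary REAL);
INSERT INTO Person  VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
INSERT INTO Student VALUES (1, 3.7);
INSERT INTO Teacher VALUES (2, 52000);
""")

# Optional participation: person 3 has no row in either subtype table.
roles = conn.execute("""
    SELECT p.Name,
           CASE WHEN s.PersonID IS NOT NULL THEN 'student'
                WHEN t.PersonID IS NOT NULL THEN 'teacher'
                ELSE 'neither' END AS role
    FROM Person AS p
    LEFT JOIN Student AS s ON s.PersonID = p.PersonID
    LEFT JOIN Teacher AS t ON t.PersonID = p.PersonID
    ORDER BY p.PersonID
""").fetchall()
```

Because the subtype key is also the primary key, each person can have at most one Student row and at most one Teacher row. Note that nothing in this sketch enforces disjointness; a person could still appear in both subtype tables.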
There are 4 options you can use to map this ER relationship into tables:
option 1
Person(SIN,Name)
Student(SIN,GPA)
Teacher(SIN,Salary)
option 2
Student(SIN,Name,GPA)
Teacher(SIN,Name,Salary)
Since this is a covering relationship, option 2 is not a good match.
option 3
Person(SIN,Name,GPA,Salary,Person_Type)
Person_Type can be student/teacher.
option 4
Person(SIN,Name,GPA,Salary,Student,Teacher)
Student and Teacher are boolean fields (yes or no); this is a good option for overlapping subclasses.
Since the subclasses don't have many attributes, options 3 and 4 are the better ways to map this into tables.
This answer could have been a comment but I am putting it up here for the visibility.
I would like to address a few things that the chosen answer failed to address - and maybe elaborate a little on the consequences of the "two table" design.
The design of your database depends on the scope of your application and the type of relations and queries you want to perform. For example, if you have two types of users (student and teacher) and you have a lot of general relations that all users can take part in, regardless of their type, then the two-table design may end up with a lot of "duplicated" relations. For instance, if users can subscribe to different newsletters, then instead of having one M2M relationship table between "users" and newsletters, you'll need two separate tables to represent that relation. This issue worsens if you have three different types of users instead of two, or if you have an extra layer of IsA in your hierarchy (part-time vs full-time students).
Another issue to consider - the types of constraints you want to implement. If your users have emails and you want to maintain a user-wide unique constraint on emails, then the implementation is trickier for a two-table design - you'll need to add an extra table for every unique constraint.
Another issue to consider is just duplications, generally. If you want to add a new common field to users, you'll need to do it multiple times. If you have unique constraints on that common field, you'll need a new table for that unique constraint too.
All of this is not to say that the two table design isn't the right solution. Depending on the type of relations, queries and features you are building, you may want to pick one design over the other, like is the case for most design decisions.
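For contrast, here is a hedged sketch of the alternative to the two-table design, in which a single users table carries the shared constraints. It uses Python's sqlite3 module; the table and column names, the add_user helper and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE users (
    id        INTEGER PRIMARY KEY,
    email     TEXT NOT NULL UNIQUE,   -- one constraint covers every user type
    user_type TEXT NOT NULL CHECK (user_type IN ('student', 'teacher'))
);
CREATE TABLE newsletters (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- One M2M table serves students and teachers alike.
CREATE TABLE subscriptions (
    user_id       INTEGER NOT NULL REFERENCES users(id),
    newsletter_id INTEGER NOT NULL REFERENCES newsletters(id),
    PRIMARY KEY (user_id, newsletter_id)
);
INSERT INTO users         VALUES (1, 'ann@example.com', 'student');
INSERT INTO newsletters   VALUES (1, 'Weekly');
INSERT INTO subscriptions VALUES (1, 1);
""")


def add_user(user_id, email, user_type):
    """Insert a user; return False if a constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users VALUES (?, ?, ?)", (user_id, email, user_type)
            )
        return True
    except sqlite3.IntegrityError:
        return False


# A teacher reusing a student's email is rejected by the single UNIQUE constraint.
duplicate_accepted = add_user(2, "ann@example.com", "teacher")
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

With separate student and teacher tables, the same guarantee would need extra machinery, which is the trade-off described above.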
It depends entirely on the nature of the relationships.
IF the relationship between a Person and a Student is 1 to N (one to many), then the correct way would be to create a foreign key relationship, where the Student has a foreign key back to the Person's ID Primary Key Column. Same thing for the Person to Teacher relationship.
However, if the relationship is M to N (many to many), then you would want to create a separate table containing those relationships.
Assuming your ERD uses 1 to N relationships, your table structure ought to look something like this:
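-- Supertype: attributes shared by every person
CREATE TABLE Person
(
    sin  bigint,
    name text,
    PRIMARY KEY (sin)
);
-- Subtype: student-only attributes, linked back to Person
CREATE TABLE Student
(
    GPA    float,
    fk_sin bigint,
    FOREIGN KEY (fk_sin) REFERENCES Person(sin)
);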
and follow the same example for the Teacher table. This approach will get you to 3rd Normal Form most of the time.

Clarification about storing courses in database

Suppose I need to take details of different colleges and store the courses they offer in a database. Assume the number of courses differs from college to college. How should the table be designed to store the courses?
The courses should be retrievable for further processing.
Can anyone suggest an idea for this?
You can start with 2 tables, 1 for Institution (the university/college), and 1 for Course. The Course table should have a foreign key institution_id to the Institution table.
This way you can have as many courses as you want for any college, and looking up courses for a college is as simple as doing a query on institution_id.
Naturally, this is only a start, and you will probably have to expand on it. For example, you might want to have another table like College that has a foreign key to Institution, to model the fact that sometimes universities have many sub-schools within them. You could also have Institution rows reference other Institution rows to model the same thing; what you want to do depends on the details.
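A minimal sketch of the two starting tables, using Python's sqlite3 module. Only institution_id comes from the answer; the lowercase table names, the title column, the courses_for helper and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE institution (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course (
    id             INTEGER PRIMARY KEY,
    title          TEXT NOT NULL,
    institution_id INTEGER NOT NULL REFERENCES institution(id)
);
INSERT INTO institution VALUES (1, 'College A'), (2, 'College B');
-- Each college can have any number of course rows.
INSERT INTO course VALUES (1, 'Algebra', 1), (2, 'Physics', 1), (3, 'History', 2);
""")


def courses_for(institution_id):
    """Titles of all courses offered by one institution."""
    rows = conn.execute(
        "SELECT title FROM course WHERE institution_id = ? ORDER BY title",
        (institution_id,),
    )
    return [title for (title,) in rows]


college_a_courses = courses_for(1)
```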

How do I organize such database in SQLite?

folks! I need some help with organizing database for application and I have no idea how to do it. Suppose following:
There is a list of academic subjects. For each subject we need to have a list of academic groups, which attend this subject. Then, for each group we need to have a list of dates. And for each date we need to have a list of students, and whether this student was present that day or not.
I have ugly data structures in my mind, will appreciate any help.
UPDATE
How do I see it:
Table1 (the first column is the date and the second is the list of ids of the students who were present)
10/10/11 | id1, id2, id3
10/11/11 | id1, id3, id5
Table2:
subject1 | id1 id2 id3
subject2 | id3 id2
Here the ids are group ids. I don't know how to connect those tables.
There are many considerations to balance when designing a database, but based on the information you provided so far, something like this might be a good start:
This ER model uses a lot of identifying relationships (i.e. "migrating" the parent's primary key into the child's PK) and results in natural primary keys, as opposed to non-identifying relationships that would require the use of surrogate keys. A lot of people like surrogate keys these days, but the truth is that both design strategies have pros and cons. In particular:
Natural keys are bigger (they "accumulate" fields over multiple levels of parent-child relationships).
But also, natural keys require less JOINing.
Natural keys can enforce constraints better in some special cases (such as diamond-shaped dependencies).
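The three points above can be illustrated with a sketch in Python's sqlite3 module. This is not a reconstruction of the answer's diagram; every table name, column name and sample row here is an assumption, including the ISO date format and the mark helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE subject     (subject_id TEXT PRIMARY KEY);
CREATE TABLE study_group (group_id   TEXT PRIMARY KEY);
CREATE TABLE student (
    student_id TEXT PRIMARY KEY,
    group_id   TEXT NOT NULL REFERENCES study_group(group_id),
    UNIQUE (student_id, group_id)     -- target for the composite FK below
);
CREATE TABLE subject_group (
    subject_id TEXT NOT NULL REFERENCES subject(subject_id),
    group_id   TEXT NOT NULL REFERENCES study_group(group_id),
    PRIMARY KEY (subject_id, group_id)
);
CREATE TABLE meeting (
    subject_id   TEXT NOT NULL,
    group_id     TEXT NOT NULL,
    meeting_date TEXT NOT NULL,
    PRIMARY KEY (subject_id, group_id, meeting_date),   -- key grows per level
    FOREIGN KEY (subject_id, group_id)
        REFERENCES subject_group(subject_id, group_id)
);
CREATE TABLE attendance (
    subject_id   TEXT NOT NULL,
    group_id     TEXT NOT NULL,
    meeting_date TEXT NOT NULL,
    student_id   TEXT NOT NULL,
    present      INTEGER NOT NULL CHECK (present IN (0, 1)),
    PRIMARY KEY (subject_id, group_id, meeting_date, student_id),
    FOREIGN KEY (subject_id, group_id, meeting_date)
        REFERENCES meeting(subject_id, group_id, meeting_date),
    -- Diamond: the student must belong to the same group as the meeting.
    FOREIGN KEY (student_id, group_id)
        REFERENCES student(student_id, group_id)
);
INSERT INTO subject       VALUES ('math');
INSERT INTO study_group   VALUES ('g1'), ('g2');
INSERT INTO student       VALUES ('s1', 'g1'), ('s2', 'g2');
INSERT INTO subject_group VALUES ('math', 'g1');
INSERT INTO meeting       VALUES ('math', 'g1', '2011-10-10');
INSERT INTO attendance    VALUES ('math', 'g1', '2011-10-10', 's1', 1);
""")

# Less JOINing: subject_id is already part of the attendance key.
present_in_math = conn.execute(
    "SELECT meeting_date, student_id FROM attendance "
    "WHERE subject_id = ? AND present = 1",
    ("math",),
).fetchall()


def mark(subject_id, group_id, meeting_date, student_id, present):
    """Record attendance; return False if a constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO attendance VALUES (?, ?, ?, ?, ?)",
                (subject_id, group_id, meeting_date, student_id, present),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# s2 belongs to g2, so marking s2 at a g1 meeting is rejected.
wrong_group_accepted = mark("math", "g1", "2011-10-10", "s2", 1)
```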
You will design one table for each kind of "thing" (subjects, groups, students, meetings) in your database. Each table will have one column for each datum (piece of information) you need to store about the thing. Additionally, there must be one column, or a predictable combination of columns, that will allow you to uniquely identify each thing (row) that you store in the table.
Then, you will decide how the things (subjects, groups, students, meetings) are related to each other and make sure that you have the correct columns in each table to store those relationships. You will discover that in some cases this can be done by adding one or more columns to the tables you already defined. In other cases, you will need to add a completely new table that doesn't store a "thing", per se, but rather a relationship between two things.
Once you have your list of tables and columns, if you feel that fails to represent some part of the problem correctly, post another question with the work you've already done and I'm sure you'll find someone to help you complete the assignment.
Response to your update:
You are on the wrong track. It is a bad idea (and contrary to correct relational database design) to ever store two values in a single field. So each of the tables you wrote about should have two columns (as you said), but the second column should store one and only one id. Instead of one row in table1 for 10/10/11, you would have three separate rows in your table.
But, before you start worrying about the "relationships", create tables to hold the "things".
I also suggest you pick up a basic guide to relational databases.
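To make the "one value per field" point concrete, here is a small sketch in Python's sqlite3 module that unpacks the asker's Table1 into one row per date and student. The attendance table, its column names and the packed dictionary are assumptions; the sample values are taken from the question, reading "1d3" as a typo for id3.

```python
import sqlite3

# The asker's Table1: each date maps to a list of ids packed into one field.
packed = {
    "10/10/11": "id1, id2, id3",
    "10/11/11": "id1, id3, id5",  # assumption: "1d3" in the question means id3
}

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE attendance (
        meeting_date TEXT NOT NULL,
        student_id   TEXT NOT NULL,   -- exactly one id per row
        PRIMARY KEY (meeting_date, student_id)
    )
""")

# Split each packed list so that every (date, id) pair becomes its own row.
rows = [
    (date, student_id.strip())
    for date, ids in packed.items()
    for student_id in ids.split(",")
]
with conn:  # commits the inserts
    conn.executemany("INSERT INTO attendance VALUES (?, ?)", rows)

row_count = conn.execute("SELECT COUNT(*) FROM attendance").fetchone()[0]

# With one id per row, "on which dates was id1 present?" is a plain filter.
dates_for_id1 = [
    date
    for (date,) in conn.execute(
        "SELECT meeting_date FROM attendance "
        "WHERE student_id = ? ORDER BY meeting_date",
        ("id1",),
    )
]
```

With the packed layout, the same question would require string matching inside the list column.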

shared table for entities teachers and students?

I have to create a database and I want to know what would be the best solution.
Let's say I have to store information about students and teachers.
Should I make one single table containing all the personal information (name, email, phone, password) for both students and teachers?
For additional information should I keep them in separate tables as ADD_TEACHERS and ADD_STUDENTS?
You may want to create a people table for the common data like name,email etc. You can then use a primary key column in this table, and if there has to be information specific to teacher (like course_instructor, head_teacher of class), then use that unique key from people as reference in your course_information table. Do the same for students too.
It sounds like your best route is to have a separate table for teachers and students. You'll have what is called a one-to-many relationship between teachers and students: one teacher, many students.
But that only covers one semester. When you consider other semesters, you have a many-to-many relationship between students and teachers.
In the end, you're best off with at least three tables: Teachers, Students, and teacher-to-student relationship.
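A minimal sketch of those three tables, using Python's sqlite3 module. The semester column is an assumption prompted by the remark about multiple semesters; the table names, helper functions and sample rows are likewise hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE teachers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Relationship table: one row per teacher/student pairing per semester.
CREATE TABLE teacher_student (
    teacher_id INTEGER NOT NULL REFERENCES teachers(id),
    student_id INTEGER NOT NULL REFERENCES students(id),
    semester   TEXT NOT NULL,
    PRIMARY KEY (teacher_id, student_id, semester)
);
INSERT INTO teachers VALUES (1, 'Ms. Lee'), (2, 'Mr. Kay');
INSERT INTO students VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO teacher_student VALUES
    (1, 1, '2011-fall'), (1, 2, '2011-fall'), (2, 1, '2012-spring');
""")


def teachers_of(student_id):
    """Every teacher a student has had, across all semesters."""
    rows = conn.execute(
        """SELECT DISTINCT t.name
           FROM teacher_student AS ts
           JOIN teachers AS t ON t.id = ts.teacher_id
           WHERE ts.student_id = ?
           ORDER BY t.name""",
        (student_id,),
    )
    return [name for (name,) in rows]


def students_of(teacher_id):
    """Every student a teacher has had, across all semesters."""
    rows = conn.execute(
        """SELECT DISTINCT s.name
           FROM teacher_student AS ts
           JOIN students AS s ON s.id = ts.student_id
           WHERE ts.teacher_id = ?
           ORDER BY s.name""",
        (teacher_id,),
    )
    return [name for (name,) in rows]


ann_teachers = teachers_of(1)
lee_students = students_of(1)
```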
The first thing you have to do is map the entity relationships (ER) and their attributes. Then design the database and apply the normalization strategies.
In the proposed scenario, students and teachers are different entities, so find out their relationship:
ONE student HAS ONE teacher
MANY students HAVE ONE teacher
MANY students HAVE MANY teachers
Then check out their attributes. Example: Student(Name, Class, DOB)
It depends on how you are going to use the data. If you have a lot of use cases where you are interested in "persons", regardless of role, or if you have a lot of persons who are both teachers and students, you may want to learn the gen-spec design pattern, as it applies to relational tables.
See previous discussion.

Resources