Creating a database schema from The Movie Database

Creating a database schema from The Movie Database - database

I'm trying to create a database schema using information pulled from the themoviedb api.
I thought I was doing ok until I went to add in the television series, then I got really confused.
The TMDb API seems to treat television series and movies as completely separate things. It further divides television listings into series, seasons, and episodes.
For example there is a separate cast listing for television seasons (season regulars) and individual episodes (guest cast). I have no idea how to reflect all this in the database.
I've tried my best to model everything below, but I think there's something wrong somewhere. Please ignore the datatypes.
Role can be either writer, director, or actor.

http://imgur.com/a/1WKQB
Hi user2146821,
Your database design looks good, with the exception of how to display the relations between regular cast and guest cast members, as you've expressed.
Currently, you are approaching the scenario by having a singular join table between Movie, TV Seasons, TV Episodes and Person. This creates a table for which you cannot have either a singular primary key nor a correct composite primary key, as you will have nulls for any given record.
In the linked image above, you can see another way of handling this relationship - you create three join tables, each with Person on one side and a corresponding table on the other (either Movie, TV Season or TV Episode). This eliminates nulls from the join tables, allows for composite primary keys to be formed in the joins tables and structures the database in a more meaningful way.

Related

Should i make seperate tables for each album?

I'm working on a database project about music and albums in MySQL, where i make a list over some popular artists, their most sold album, and the songs contained within them. But i suddenly got really uncertain of what to do when it comes to filling in the name of the songs for each album. Should i make an individual table for each list of songs, or should all the songs (about 50 of them in total from all the albums) from all the different artists (5 different artists) be filled inn in the same table (i'm eventually gonna export the data and connect it to a PHP folder
Hope the question was clear

All the songs should be in one "Songs" table. Then you create a column "Album ID" in that table which is a foreign key back to the ID column in the albums table. This is how you know which song belongs to which album. (And of course you have the same kind of relationship between "album" and "artist".)
This is called a "one-to-many" relationship and is one of the basic principles of relational database design.
If you ever find yourself creating multiple tables to represent the same kind of data item, you know you've gone wrong.
N.B. If you want to support the idea that the same song (or track probably, to be more accurate, since many different recordings of a song could potentially be made) can be included on multiple albums, then you'll need to implement a "many-to-many" relationship where you have an extra table in between "albums" and "songs" which holds Album ID and Song ID. Each would be a foreign key back to the Albums and Songs tables, respectively. And to ensure no duplication, both fields would be specified as a Compound Primary Key. That way you can list the same Song ID in that table many times against different albums. Same again if you want to have that flexibility in the relationship between "artists" and "albums".
This might be a good time to take a break and study relational database design and data normalisation concepts in some more detail, then you can start to see these patterns for yourself and make the right decision in your schema designs.

Similarly to this question about databases for playlists you should also use one table for the albums and one table for the songs.
Additionally you might also need a table for artists, etc.

Splitting orderDetail into two tables for a database?

I will try my best to phrase this in a way that makes sense. I am working on a database project for my beginning database management course that uses a fictional scenario of a bookstore owner who wants me to create a database for them.
Essentially, the tables (or entities) that I have come up with are as follows:
Customer,
Product,
Order,
orderLine,
Book,
Author,
Publisher
To put it simply, I need configure this so that I can track both books and other nonbook items from sales. The issue that I am running into with this is that when I tried to just have one products table, I ran into the issue that books have a bunch of attributes that other items (such as bakery items). If I put books with other items, then there would be a whole lot of empty cells where there is no author/publisher/genre. From what my textbook has taught me so far, a composite table is needed for an orderDetail-type table, where the orderNumber and productNumber would combine. But here, I would need to somehow combine two seperate KEY attributes (for books and other items) into one order table, or some other method. This is especially confusing to me since some customers might buy a combination of books an other items in a single order, or they might only buy one type of thing. I was thinking that the ISBN would be an excellent identifier key for the Book table. What kind of configuration would I need to track orders like this?

Add one more table , product-props and store attributes in that table. you can keep all products in one table i.e books and other items as well. you can move author, publisher as props of this as well

Where should I store repetitive data in Access?

I'm creating this little Access DB, for the HR department to store all data related to all the training sessions that the company organizes for all the employees.
So, I have a Training Session table with information like date, subject, place, observations, trainer, etc, and the unique ID number.
Then there's the Personnel table, with employer ID (which is also the unique table number), names and working department.
So, after that I need another table that keeps a record of all the attendants of each training session. And here's the question, should I use a table for that in the first place? Does it have to be one table for each training session to store the attendants?
I've used excel for quite some time now, but I'm very new to Access and databases (even small ones like this). Any information will be highly appreciated.
Thanks in advance!

It should be one table for persons, one table for trainings, and one for participation/attendance, to minimize (or best: avoid) repetition. Your tables should use primary and foreign keys, so that there are one-to-many relationships between trainings and attendances as well as people and attendances (the attendances table would then have a column referring to the person who attended, and another column referring to the training session).
Google "database normalization" for more detail and variations of that principle (https://en.wikipedia.org/wiki/Database_normalization).

Linking an address table to multiple other tables

I have been asked to add a new address book table to our database (SQL Server 2012).
To simplify the related part of the database, there are three tables each linked to each other in a one to many fashion: Company (has many) Products (has many) Projects and the idea is that one or many addresses will be able to exist at any one of these levels. The thinking is that in the front-end system, a user will be able to view and select specific addresses for the project they specify and more generic addresses relating to its parent product and company.
The issue now if how best to model this in the database.
I have thought of two possible ideas so far so wonder if anyone has had a similar type of relationship to model themselves and how they implemented it?
Idea one:
The new address table will additionally contain three fields: companyID, productID and projectID. These fields will be related to the relevant tables and be nullable to represent company and product level addresses. e.g. companyID 2, productID 1, projectID NULL is a product level address.
My issue with this is that I am storing the relationship information in the table so if a project is ever changed to be related to a different product, the data in this table will be incorrect. I could potentially NULL all but the level I am interested in but this will make getting parent addresses a little harder to get
Idea two:
On the address table have a typeID and a genericID. genericID could contain the IDs from the Company, Product and Project tables with the typeID determining which table it came from. I am a little stuck how to set up the necessary constraints to do this though and wonder if this is going to get tricky to deal with in the future
Many thanks,

I will suggest using Idea one and preventing Idea two.
Second Idea is called Polymorphic Association anti pattern
Objective: Reference Multiple Parents
Resulting side effect: Using dual-purpose foreign key will violating first normal form (atomic issue), loosing referential integrity
Solution: Simplify the Relationship
The simplification of the relationship could be obtained in two ways:
Having multiple null-able forging keys (idea number 1): That will be
simple and applicable if the tables(product, project,...) that using
the relation are limited. (think about when they grow up to more)
Another more generic solution will be using inheritance. Defining a
new entity as the base table for (product, project,...) to satisfy
Addressable. May naming it organization-unit be more rational. Primary key of this organization_unit table will be the primary key of (product, project,...). Other collections like Address, Image, Contract ... tables will have a relation to this base table.

It sounds like you could use Junction tables http://en.wikipedia.org/wiki/Junction_table.
They will give you the flexibility you need to maintain your foreign key restraints, as well as share addresses between levels or entities if that is desired.
One for Company_Address, Product_Address, and Project_Address

Table "Inheritance" in SQL Server

I am currently in the process of looking at a restructure our contact management database and I wanted to hear peoples opinions on solving the problem of a number of contact types having shared attributes.
Basically we have 6 contact types which include Person, Company and Position # Company.
In the current structure all of these have an address however in the address table you must store their type in order to join to the contact.
This consistent requirement to join on contact type gets frustrating after a while.
Today I stumbled across a post discussing "Table Inheritance" (http://www.sqlteam.com/article/implementing-table-inheritance-in-sql-server).
Basically you have a parent table and a number of sub tables (in this case each contact type). From there you enforce integrity so that a sub table must have a master equivalent where it's type is defined.
The way I see it, by this method I would no longer need to store the type in tables like address, as the id is unique across all types.
I just wanted to know if anybody had any feelings on this method, whether it is a good way to go, or perhaps alternatives?
I'm using SQL Server 05 & 08 should that make any difference.
Thanks
Ed

I designed a database just like the link you provided suggests. The case was to store the data for many different technical reports. The number of report types is undefined and will probably grow to about 40 different types.
I created one master report table, that has an autoincrement primary key. That table contains all common information like customer, testsite, equipmentid, date etc.
Then I have one table for each report type that contains the spesific information relating to that report type. That table have the same primary key as the master and references the master as well.
My idea for splitting this into different tables with a 1:1 relation (which normally would be a no-no) was to avoid getting one single table with a huge number of columns, that gets very difficult to maintain as your constantly adding columns.
My design with table inheritance gave me segmented data and expandability without beeing difficult to maintain. The only thing I had to do was to write special a special save method to handle writing to two tables automatically. So far I'm very happy with the design and haven't really found any drawbacks, except for a little more complicated save method.

Google on "gen-spec relational modeling". You'll find a lot of articles discussing exactly this pattern. Some of them focus on table design, while others focus on an object oriented approach.
Table inheritance pops up in a few of them.

I know this won't help much now, but initially it may have been better to have an Entity table rather than 6 different contact types. Then each Entity could have as many addresses as necessary and there would be no need for type in the join.

You'll still have the problem that if you want the sub-type fields and you have only the master contact, you'll have to know what table to go looking at - or else join to all of them. But otherwise this is a workable solution to a common problem.
Another possibility (fairly similar in structure, but different in how you think of it) is to simply put all your contacts into one table. Then for the more specific fields (birthday say for people and department for position#company) create separate tables that are associated with that contact.
Contact Table
--------------
Name
Phone Number
Address Table
-------------
Street / state, etc
ContactId
ContactBirthday Table
--------------
Birthday
ContactId
Departments Table
-----------------
Department
ContactId
It requires a different way of thinking of things though - instead of thinking of people vs. companies, you think of the various functional requirements for the task at hand - if you want to send out birthday cards, get all the contacts that have birthdays associated with them, etc..

I'm going to go out on a limb here and suggest you should rethink your normalization strategy (as you seem to be lucky enough to be able to rethink your schema quite fundamentally). If you typically store an address for each contact, then your contact table should have the address fields in it. Alternatively if the address is stored per company then the address should be stored in the company table and your contacts linked to that company.
If your contacts only have one address, or one (or even 3, just not 'many') instance of the other fields, think about rationalizing them into a single table. In my experience having a few null fields is a far better alternative than needing left joins to data you aren't sure exists.
Fortunately for anyone who vehemently disagrees with me you did ask for opinions! :) IMHO you should only normalize when you really need to. Where you are rethinking schemas, denormalization should be considered at every opportunity.

When you have a 7th type, you'll have to create another table.

I'm going to try this approach. Yes, you have to create new tables when you have a new type, but since this table will probably have different columns, you'll end up doing this anyway if you don't use this scheme.
If the tables that inherit the master don't differentiate much from one another, I'd recommend you try another approach.

May I suggest that we just add a Type table. Ie a person has an address, name etc then the student, teacher as each use case presents its self we have a PersonType table that has an entry from the person table to n types and the subsequent new tables teacher, alien, singer as the system eveolves...

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight