I want to design a database to store crossword puzzles.
I mainly have 2 tables,
a Questions table and a Grids table:
Questions Table(q_id, question , answer...)
Grids Table (g_id, name, ....)
When it comes to relating the Questions table to the Grids table, I am looking for a good approach. My idea is:
Questions_Grid(q_id, g_id), where the pair would be the primary key.
Another solution, which my boss suggested: Grids Table (g_id, q_ids, ....)
where q_ids stores the ids of all the questions used in this grid.
Which one is better? And do you have any better options?
It looks like your boss's suggestion is to store a list of question ids as a text column in the grid table. If I understand what you're asking and that's really it, the first one is much better, of course, because it's normalized. In your scheme, you can make many useful queries that would be messy or impossible (and slow, if possible) in your boss's scheme.
The first option is better: that way the schema of your database does not restrict the number of questions per grid. A rule of thumb is that if you have to change the schema to make the application scale, you haven't got an optimal schema.
A link table with one row per (question, grid) pair is more normalized.
Question_grid
-------------
q_id
g_id
This will allow you to have as many questions as necessary for the grid, and no more.
If you try to hardcode the list, then each grid will need the same number of questions, or will have blanks or something. No good.
Well, the rule is to make a linking table when you have a many-to-many relationship: here a grid has many questions, and with (q_id, g_id) as the key a question can also appear in more than one grid. So your option would be the best answer, it's normalized!
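As a rough illustration of the link-table option, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the question; the sample rows and the choice of SQLite are assumptions made only for the example.

```python
import sqlite3

# In-memory database; SQLite is only an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE questions (
    q_id     INTEGER PRIMARY KEY,
    question TEXT NOT NULL,
    answer   TEXT NOT NULL
);
CREATE TABLE grids (
    g_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Link table: one row per (question, grid) pair, the pair is the primary key.
CREATE TABLE questions_grid (
    q_id INTEGER NOT NULL REFERENCES questions(q_id),
    g_id INTEGER NOT NULL REFERENCES grids(g_id),
    PRIMARY KEY (q_id, g_id)
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO questions VALUES (?, ?, ?)",
                 [(1, "Feline pet", "CAT"),
                  (2, "Canine pet", "DOG"),
                  (3, "Opposite of yes", "NO")])
conn.executemany("INSERT INTO grids VALUES (?, ?)",
                 [(1, "Monday"), (2, "Tuesday")])
conn.executemany("INSERT INTO questions_grid (q_id, g_id) VALUES (?, ?)",
                 [(1, 1), (2, 1), (1, 2), (3, 2)])

# "Which grids use question 1?" is a plain join here; with a text column
# of ids it would need string parsing.
grids_using_q1 = [row[0] for row in conn.execute(
    """SELECT g.name
       FROM grids g
       JOIN questions_grid qg ON qg.g_id = g.g_id
       WHERE qg.q_id = ?
       ORDER BY g.g_id""", (1,))]
print(grids_using_q1)
```

The composite primary key also stops the same question from being linked to the same grid twice, and the foreign keys stop a grid from pointing at a question that does not exist.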
I have to take an online course on DB design once again since I got a really lazy teacher that I thought had taught us everything and I continue to find out he didn't.
I'm designing a small DB in which two particular tables brought up this question.
I have a table called "Athlete" which stores athlete info and a second table called "EntryInfo" which stores an athlete's objectives and whether he was referred by another athlete.
There is no way an athlete could have more than one of these entry infos, so I thought idAthlete would apply to both "Athlete" and "EntryInfo", but I don't know if this is correct or not. Now I have these questions:
1) In trying to keep "Athlete" table as clean as possible I didn't include this "EntryInfo" in the "Athlete" table from the beginning but it COULD be in the same table. Is this the best way to handle it? Regarding good practices in DB design should they be in 1 or 2 tables?
2) If it's better to keep it in two separate tables, can I have idAthlete as PK in the Athlete table (identity, incremental) and also use it as the PK of EntryInfo, where it is only an FK? Or would it be better practice to have an incremental identity PK idEntryInfo on the EntryInfo table with an FK idAthlete?
I know this is such a basic question and I know I should take a course on DB design and normalisation (and I will do).
When you have two tables with the same key it's called vertical partitioning and it's a valid design for various reasons.
However, I don't see any such reasons in your explanation. I only see your statement about keeping the "Athlete" table as clean as possible, which has a pretty general meaning. If you're going to put different groups of fields into different tables, you can categorise them any number of ways.
If you had a zillion records and you had performance issues it might be worth considering.
It will be simpler for you if you keep it in one table; then you don't have to fiddle about synchronising keys between the tables.
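If the two-table variant is kept anyway, the shared-key idea from the question could look roughly like the sketch below (Python's built-in sqlite3 module). The columns name, objectives and referredBy are hypothetical, since the question does not list the real ones, and AUTOINCREMENT stands in for the identity column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE Athlete (
    idAthlete INTEGER PRIMARY KEY AUTOINCREMENT,
    name      TEXT NOT NULL              -- hypothetical column
);
-- idAthlete is both the primary key of EntryInfo and a foreign key to
-- Athlete, so each athlete can have at most one EntryInfo row.
CREATE TABLE EntryInfo (
    idAthlete  INTEGER PRIMARY KEY REFERENCES Athlete(idAthlete),
    objectives TEXT,                                     -- hypothetical column
    referredBy INTEGER REFERENCES Athlete(idAthlete)     -- hypothetical column
);
""")

athlete_id = conn.execute(
    "INSERT INTO Athlete (name) VALUES (?)", ("Ana",)).lastrowid
conn.execute(
    "INSERT INTO EntryInfo (idAthlete, objectives) VALUES (?, ?)",
    (athlete_id, "Run a marathon"))


def second_entry_rejected(athlete):
    """Return True if a second EntryInfo row for the same athlete is refused."""
    try:
        conn.execute(
            "INSERT INTO EntryInfo (idAthlete, objectives) VALUES (?, ?)",
            (athlete, "Another goal"))
    except sqlite3.IntegrityError:
        return True
    return False


# Reading the data back needs a join, which is the extra work the
# single-table design avoids.
row = conn.execute(
    """SELECT a.name, e.objectives
       FROM Athlete a
       LEFT JOIN EntryInfo e ON e.idAthlete = a.idAthlete
       WHERE a.idAthlete = ?""", (athlete_id,)).fetchone()
print(row, second_entry_rejected(athlete_id))
```

With this layout no separate idEntryInfo is needed; the one-to-one rule is enforced by the key itself.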
Today I was designing a database for a potential personal project of mine. Since I couldn't decide which option would be better, I asked my Databases teacher; unfortunately he couldn't tell me which of the two options is better than the other and why.
I designed the database for a dummy data generator. Since I want to generate multilingual data I thought of these tables. (But it's a simplification of the tables.)
(first and last)names: id, name
streets: id, name
languages: id, name
Each names.name and streets.name originates from a language; sometimes a name can have multiple origins (ex: Nick is both a Dutch and an English name).
Each language has multiple names and streets.
These two rules result in a Many-to-Many relationship. At the moment I've got only two tables, but I know I will get between 10 and 20 of these kinds of tables.
The regular way one would do this is just make 10 to 20 Many-to-Many relationship tables.
Another idea I came up with was just one Many-to-Many table with a third column which specifies which table the id relates to.
At the moment I've got the design on my other PC so I will update it with my ideas visualized after dinner (2 hours or so).
Which idea is better and why?
To make the project idea a bit clearer:
It is always a hassle to create good and enough realistic looking working data for projects. This application will generate this data for you and return the needed SQL so you only have to run the queries.
The user comes to the site to get the data. He states his tablename, his columnnames and then he can link the columnnames to types of data, think of:
* Firstname
* Lastname
* Email address (which will be randomly generated from the name of the person)
* Address details (street, housenumber, zipcode, place, country)
* A lot more
Then, after linking columns with the types the user can set the number of rows he wants to make. The application will then choose a country at random and generate realistic looking data according to the country they live in.
That's actually an excellent question. This sort of thing leads to a genuine problem in database design and there is a real tradeoff. I don't know what rdbms you are using but....
Basically you have four choices, all of them with serious downsides:
1. One M-M table with, besides language, one fkey column per potential table and check constraints ensuring that only one of those fkeys is filled in. Ick....
2. One M-M table per relationship. This makes things quite hard to manage over time especially if you need to change something from an int to a bigint at some point.
3. One M-M table with a polymorphic relationship. You lose a lot of referential integrity checks when you do this and to make it safe, have fun coding (and testing!) triggers.
4. Look carefully at the advanced features in your rdbms for a solution. For example, in PostgreSQL this can be solved with table inheritance. The downside is that you lose portability and end up in advanced territory.
Unfortunately there is no single definite answer. You need to consider the tradeoffs carefully and decide what makes sense for your project. If I was just working with one RDBMS, I would do the last one. But if not, I would probably do one table per relationship and focus on tooling to manage the problems that come up. But the former preference is about my level of knowledge and confidence, and the latter is a bit more of a personal opinion.
So I hope this helps you look at the tradeoffs and select what is right for you.
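To make the referential-integrity tradeoff between choices 2 and 3 concrete, here is a small sketch with Python's built-in sqlite3 module. The table names names_languages and language_links and the column item_table are invented for the illustration; only names, streets and languages come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE languages (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE names     (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE streets   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Choice 2: one M-M table per relationship, both sides are real foreign keys.
CREATE TABLE names_languages (
    name_id     INTEGER NOT NULL REFERENCES names(id),
    language_id INTEGER NOT NULL REFERENCES languages(id),
    PRIMARY KEY (name_id, language_id)
);

-- Choice 3: one polymorphic M-M table. item_id cannot be declared as a
-- foreign key because its target table depends on item_table.
CREATE TABLE language_links (
    language_id INTEGER NOT NULL REFERENCES languages(id),
    item_table  TEXT    NOT NULL CHECK (item_table IN ('names', 'streets')),
    item_id     INTEGER NOT NULL,
    PRIMARY KEY (language_id, item_table, item_id)
);

INSERT INTO languages VALUES (1, 'Dutch'), (2, 'English');
INSERT INTO names VALUES (1, 'Nick');
""")


def insert_accepted(sql, params):
    """Return True if the insert succeeds, False on a constraint violation."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return False
    return True


# There is no name with id 999, so both inserts describe an orphan link.
per_table_accepts_orphan = insert_accepted(
    "INSERT INTO names_languages (name_id, language_id) VALUES (?, ?)",
    (999, 1))
polymorphic_accepts_orphan = insert_accepted(
    "INSERT INTO language_links (language_id, item_table, item_id) "
    "VALUES (?, ?, ?)",
    (1, "names", 999))
print(per_table_accepts_orphan, polymorphic_accepts_orphan)
```

The per-relationship table refuses the orphan row, while the polymorphic table stores it silently unless triggers or application code check it.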
I am building a dynamic customer feedback form and I have come across a bit of a problem with the structure of the database. I have included an ERD showing what I am trying to do.
The idea is that I wish to try several different feedback forms and see which ones work the best. I can choose specific questions by having a link entity between the FeedbackForm and the Question table. I am not quite sure how the tables should be linking. If you look at the ERD I believe that I should rename "FeedbackType" to "FeedbackTemplate".
The tables are:
FeedbackType
FeedbackForm
Question
FeedbackFormQuestion (link entity)
The problem is that the feedback type (which is a type of template) does not "know" the Question table, but I think it should. The problem then is that if I link it to the Question table, it will all join together in a circle. I have a feeling this is incorrect! It may be fine doing this - I am not entirely sure.
Just for some idea of what the questions are I would have something like:
How easy was it to use the site?
Very difficult
Not very easy
Satisfactory
Good
Very Easy
It may be a simple problem that I am simply overlooking or just a lack of experience with this type of problem. I am happy with any form of advice but I wish to make sure this is professionally done. It is not that I cannot get something to work - it is more a case of making sure that it is done professionally.
You can use a ternary (three-way) relationship between the FbType, FbForm and Questions tables.
An example for relationship table between them:
FBFId  FBTId  QId
-----------------
1      1      1
1      1      2
1      1      3
This way you can link with 3 tables.
You can also add a score column to this table, and that way get rid of
the FeedbackFormQuestion table. You then have 3 tables with one relationship table.
For example, I have photos and videos tables, and users can comment on both. When I store the comments in the database, which way is better?
To have 2 tables for comments:
photo_comments and
video_comments
Or to have 1 table comments and
create a column inside the table like
type and put there if it's a
photo_comment or video_comment
I think the first is faster because I have less data when I need to query the table, but maybe the second is easier to use.
Please let me know what's the best way, speed is very important for me.
I'm talking about a very big system with millions of records, millions of comments, so I want the fastest way to get the results. It doesn't matter to me if I need to write more code or keep something extra in mind; results are much more important!
If you really have two separate data tables photos and videos, I would always choose to use two separate comments tables, too.
Why?
If you put all your comments into a single comments table, but that references media from two separate data tables, there's no way you can easily set up a referential integrity between your comments table and the two data tables. There are some workarounds (like having two separate reference fields, one for each), but none are really very compelling. Not having a referential integrity will ultimately lead to "zombie" data that doesn't belong to any existing media entry.
Having two comments tables allows each comment table to properly reference its associated data table, thus your data integrity in the database will be better.
For that reason, if you have two separate data tables, I would always choose to use two separate comments tables as well.
It depends a bit more on how photos and videos are structured. Consider the following DB Design:
MediaType
----------
ID *
Name
Media
----------
ID *
TypeID
OwnerName
Name
Size
Path
Photo
----------
MediaID *
MediaTypeID (constraint, always set to the photo type)
Height
Width
Video
---------
MediaID *
MediaTypeID (constraint, always set to the video type)
Rating
If Photo and Video both had a FK to MediaType and to Media, I would make Comments relate to the Media table instead of either one, and not to the Photos or Videos table directly. This is often the type of design I use when Photo and Video have a lot of common properties. It's especially useful when you want to do things like security because you aren't boxed into repeating the same visibility and ownership constructs on each type of media you're dealing with. It's also quite fast to query because many queries often look only for common properties, or just type-specific rows, so some tables don't need to be included. Designing the database by modeling these IS-A relationships also keeps your indexes highly selective, which means speed.
If you're locked into your design and Videos and Photos have no common "base table", then I would make a separate comments table for each.
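The base-table layout above could be sketched as follows with Python's built-in sqlite3 module. The Comment table, its Body column and the sample rows are assumptions for the illustration, only a few of the listed columns are kept, and the composite foreign key is just one possible way to implement the "always set to the photo/video type" constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE MediaType (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
CREATE TABLE Media (
    ID     INTEGER PRIMARY KEY,
    TypeID INTEGER NOT NULL REFERENCES MediaType(ID),
    Name   TEXT NOT NULL,
    UNIQUE (ID, TypeID)          -- target for the composite FKs below
);
CREATE TABLE Photo (
    MediaID     INTEGER PRIMARY KEY,
    MediaTypeID INTEGER NOT NULL CHECK (MediaTypeID = 1),  -- always 'photo'
    Height INTEGER,
    Width  INTEGER,
    FOREIGN KEY (MediaID, MediaTypeID) REFERENCES Media(ID, TypeID)
);
CREATE TABLE Video (
    MediaID     INTEGER PRIMARY KEY,
    MediaTypeID INTEGER NOT NULL CHECK (MediaTypeID = 2),  -- always 'video'
    Rating INTEGER,
    FOREIGN KEY (MediaID, MediaTypeID) REFERENCES Media(ID, TypeID)
);
-- Comments reference the common Media row, not Photo or Video directly.
CREATE TABLE Comment (
    ID      INTEGER PRIMARY KEY,
    MediaID INTEGER NOT NULL REFERENCES Media(ID),
    Body    TEXT NOT NULL
);
CREATE INDEX idx_comment_media ON Comment(MediaID);

INSERT INTO MediaType VALUES (1, 'photo'), (2, 'video');
INSERT INTO Media VALUES (10, 1, 'sunset.jpg'), (20, 2, 'surf.mp4');
INSERT INTO Photo VALUES (10, 1, 600, 800);
INSERT INTO Video VALUES (20, 2, 5);
INSERT INTO Comment (MediaID, Body) VALUES
    (10, 'Nice colours'), (20, 'Great wave'), (10, 'Where is this?');
""")


def comments_for(media_id):
    """Return the comment bodies for one media row, oldest first."""
    rows = conn.execute(
        "SELECT Body FROM Comment WHERE MediaID = ? ORDER BY ID", (media_id,))
    return [r[0] for r in rows]


def orphan_comment_rejected():
    """Return True if a comment on a non-existent media row is refused."""
    try:
        conn.execute(
            "INSERT INTO Comment (MediaID, Body) VALUES (?, ?)", (999, "lost"))
    except sqlite3.IntegrityError:
        return True
    return False


print(comments_for(10), orphan_comment_rejected())
```

A single Comment table keeps full referential integrity here because every photo and video has exactly one Media row to point at.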
Why not have only one comment table? Is there any difference between a comment on a video and a comment on a photo? If not, you only need a column that holds the foreign key of the video/photo the comment points to, and an additional ENUM column that records which type of resource the comment is meant for.
Using an ENUM will keep your queries very fast (as it is saved as a number) and makes it easy to use strings in your queries.
Splitting up the tables would be better performance-wise, since you wouldn't have to query on an extra "comment type" column. The downside of doing things this way is not reusing code (possibly in the future, if you add comments to other things). But it doesn't sound like you're concerned with that.
I don't think that the choice of whether to have 1 or 2 tables for comments is going to have any appreciable impact on the performance of your application.
You should choose whichever one makes more sense in the context of your application.
For example, if comments on photos and comments on videos are both going to act in the same way then you should have one table, if however (for example) comments on videos are allowed to be twice as long as comments on photos, or comments on photos have an additional "ranking" field or something, then 2 tables would make more sense.
your queries will either look like
select * from comments where linked_id = 555
or
select * from comments where linked_id = 555 and comment_type = 1
(with comment type=1 meaning it's a video).
As long as comment type has an index, they will basically be just as fast.
The only thing I would consider is columns. If video comments have a different set of columns from picture comments, split 'em up. If everything is the same, keep 'em together.
I'm working on a social networking system that will have comments coming from several different locations. One could be friends, one could be events, one could be groups--much like Facebook. What I'm wondering is, from a practical standpoint, what would be the simplest way to write a comments table? Should I do it all in one table and allow foreign keys to all sorts of different tables, or should each distinct table have its own comment table? Thanks for the help!
A single comments table is the more elegant design, I think. Rather than multiple FKs though, consider an intermediate table - CommentedItem. So Friend, Event, Group, etc all have FKs to CommentedItem, and you create a CommentedItem row for each new row in each of those tables. Now Comments only needs one FK, to CommentedItem. For example, to get all Comments for a given Friend:
SELECT * FROM Comment c
JOIN CommentedItem ci on c.CommentedItemId = ci.CommentedItemId
JOIN Friend f on f.CommentedItemId = ci.CommentedItemId
WHERE f.FriendId = #FriendId
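A minimal sketch of this CommentedItem layout, using Python's built-in sqlite3 module, might look like the following. The Name, Title and Body columns, the helper functions and the sample rows are assumptions; only the table names and the CommentedItemId and FriendId keys come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE CommentedItem (CommentedItemId INTEGER PRIMARY KEY);
CREATE TABLE Friend (
    FriendId        INTEGER PRIMARY KEY,
    Name            TEXT NOT NULL,
    CommentedItemId INTEGER NOT NULL UNIQUE
                    REFERENCES CommentedItem(CommentedItemId)
);
CREATE TABLE Event (
    EventId         INTEGER PRIMARY KEY,
    Title           TEXT NOT NULL,
    CommentedItemId INTEGER NOT NULL UNIQUE
                    REFERENCES CommentedItem(CommentedItemId)
);
-- Comment needs a single foreign key, whatever is being commented on.
CREATE TABLE Comment (
    CommentId       INTEGER PRIMARY KEY,
    CommentedItemId INTEGER NOT NULL
                    REFERENCES CommentedItem(CommentedItemId),
    Body            TEXT NOT NULL
);
""")


def new_commented_item():
    """Create a CommentedItem row and return its id."""
    return conn.execute(
        "INSERT INTO CommentedItem DEFAULT VALUES").lastrowid


# Each new Friend or Event row gets its own CommentedItem row.
friend_item = new_commented_item()
conn.execute("INSERT INTO Friend VALUES (?, ?, ?)", (1, "Sam", friend_item))
event_item = new_commented_item()
conn.execute("INSERT INTO Event VALUES (?, ?, ?)", (1, "Picnic", event_item))

conn.executemany(
    "INSERT INTO Comment (CommentedItemId, Body) VALUES (?, ?)",
    [(friend_item, "Hi Sam"),
     (event_item, "See you there"),
     (friend_item, "Long time!")])


def comments_for_friend(friend_id):
    """Return the comment bodies for one friend, oldest first."""
    rows = conn.execute(
        """SELECT c.Body
           FROM Comment c
           JOIN CommentedItem ci ON c.CommentedItemId = ci.CommentedItemId
           JOIN Friend f ON f.CommentedItemId = ci.CommentedItemId
           WHERE f.FriendId = ?
           ORDER BY c.CommentId""", (friend_id,))
    return [r[0] for r in rows]


print(comments_for_friend(1))
```

Adding a new commentable type, such as Group, then only needs a new table with its own CommentedItemId column; the Comment table stays unchanged.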
I've done both and the answer depends on the situation. For what you are trying to do, I would do a SINGLE "Comments" table, and then separate "linker" tables. This will give you the best performance as you can achieve the "Perfect Index".
I would also recommend putting a "CommentTypeID" field in the Comments table to give a 'clue' as to which linker table you will pull from for the additional detail.
EDIT: The CommentTypeID field should not be used in the indexes, but rather it's only for use in code.
One thing to be careful about: if you don't use a highly normalized database, it can sometimes cause IO row chaining and table scans.
I believe Oracle suggests normalizing to about third normal form.
This is an equivalent question to this one.
EDIT: Based on a comment, it isn't clear that this is an equivalent question, so I spell it out below.
Both questions ask about projects (both happen to be Social Networks, but that's just coincidence) where there is a question about the performance of the database. Both have a diverse set of objects that share a common collection of attributes (in one it is Events, that occur on each object, in the other it is Comments that occur on each object).
Both questions effectively ask whether it is more efficient to create a UNION query that combines the disparate common features, or to factor them out into a common table, with appropriate foreign keys.
I see them as equivalent; the best answer to one will apply equally to the other.
(If you disagree, I am happy to hear why; please leave a comment.)
I would go for polymorphic associations. Many modern web development frameworks support them out of the box, which makes them really the simplest and most painless way to handle these kinds of relationships.
Actually you can probably go to http://www.zazazine.com and look through their articles. You may find an answer there