Database Design and normalization in chess - database

I was wondering what the better approach for my database/table design would be. As shown in the picture, I have players who play a match. One player plays multiple matches and one match is played by multiple players, so it is an n:m relation. This could result in three tables: player(id, firstname), player_to_match(playerid, matchid), match(id).
In my case, the number of players never changes, it is always two (n=2). Which of the following designs is better?
(1)
player_to_match(matchid, playerid)
Having two rows for each match and one redundant cell (matchid)
(2)
match(matchid, playerid1, playerid2)
As I said, the number of players per match can never change.
Thank you
Lucas
[ERM diagram with two entities: Player(ID, Firstname), Match(ID), and an n:m association from Player to Match titled "plays"]
http://fs1.directupload.net/images/141210/rmeuutpg.png

I'd stick with option (1). It will make it easier to answer such simple questions as "how many matches has player X played?" With option (2), you'd have to query two columns for the value X to answer that question and that starts to get ugly.
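To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name match_v2, the sample rows, and the helper function names are my own assumptions for illustration; only the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE player (id INTEGER PRIMARY KEY, firstname TEXT);
    -- Option (1): one row per player per match
    CREATE TABLE player_to_match (matchid INTEGER, playerid INTEGER REFERENCES player(id));
    -- Option (2): both players on the match row (called match_v2 here, an assumed name)
    CREATE TABLE match_v2 (matchid INTEGER PRIMARY KEY, playerid1 INTEGER, playerid2 INTEGER);
    INSERT INTO player VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO player_to_match VALUES (10, 1), (10, 2), (11, 1), (11, 3);
    INSERT INTO match_v2 VALUES (10, 1, 2), (11, 3, 1);
""")


def matches_played_option1(player_id):
    # A single column to filter on
    sql = "SELECT COUNT(*) FROM player_to_match WHERE playerid = ?"
    return conn.execute(sql, (player_id,)).fetchone()[0]


def matches_played_option2(player_id):
    # The same question has to look at both player columns
    sql = "SELECT COUNT(*) FROM match_v2 WHERE playerid1 = ? OR playerid2 = ?"
    return conn.execute(sql, (player_id, player_id)).fetchone()[0]
```

Both functions return the same count; the second one simply has to mention the player twice, and every similar query will too.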

Use the 2-table design. For something like this, you don't need the extra complexity because there is no chance chess is ever going to need 3 players. Unless you watch Big Bang Theory...
I prefer to start with the simpler form, and then modify it later if needed. As developers, we tend to try to come up with a solution that will handle any future possibility, but most of the time it never happens, and we've wasted a lot of time building an elegant solution for a problem that doesn't exist. Go simple first.
If you do need the 3-table option, you have some extra work to do to make sure there are always 2 related records to a match, no more, no less. Make sure you can't delete a user that is attached to an existing match, or you will have a match with only one player. A few things like that you'll have to watch out for.

I would do this:
matchid
black (references PLAYER)
white (references PLAYER)
The number of players in the game is finite (two), which eliminates the rationale for a 1-to-n child table; each player moreover has a defined "role" (white vs black) and you'd want to be able to distinguish them in that manner.
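A rough sketch of that layout in SQLite via Python, in case it helps. The table name chess_match, the CHECK constraint, and the try_insert helper are assumptions I added; the answer itself only names matchid, black and white.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE player (id INTEGER PRIMARY KEY, firstname TEXT NOT NULL);
    CREATE TABLE chess_match (
        matchid INTEGER PRIMARY KEY,
        white   INTEGER NOT NULL REFERENCES player(id),
        black   INTEGER NOT NULL REFERENCES player(id),
        CHECK (white <> black)  -- assumed rule: nobody plays against themselves
    );
    INSERT INTO player VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO chess_match VALUES (100, 1, 2);
""")


def try_insert(matchid, white, black):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("INSERT INTO chess_match VALUES (?, ?, ?)", (matchid, white, black))
        return True
    except sqlite3.IntegrityError:
        return False
```

With NOT NULL on both columns, a match can never exist with fewer than two players, which is the integrity rule the link-table design has to enforce by other means.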

Related

Should I store this in the database or in the code?

I'm creating a small game composed of weapons. Weapons have characteristics, like accuracy. When a player crafts such a weapon, a value between min and max is generated for each characteristic. For example, the accuracy of a new gun is a number between 2 and 5.
My question is: should I store the minimum and maximum values in the database, or should they be hard-coded?
I understand that putting them in the database allows me to change these values easily. However, they won't change very often, and doing this means making a database request whenever I need them. Moreover, it means having many more tables. On the other hand, is it good practice to store this directly in the code?
In conclusion, I really don't know which solution to choose, as both have advantages and disadvantages.
If you have attributes of an entity, then you should store them in the database.
That is what databases are for, storing data. I can see no advantage to hardcoding such values. Worse, the values might be used in different places in your code. And, when you update them, you might end up with inconsistent values throughout the code.
EDIT:
If these are default values, then I can imagine storing them in the code along with all the other information about the weapon -- name of the weapon, category, and so on. Those values are the source information for the weapons.
I still think it would be better to have a Weapons table or WeaponDefaults table so these are in the database. Right now, you might think the defaults are only used in one place. You would be surprised how software can grow. Also, having them in the database makes the values more maintainable.
I would have to agree with @Gordon_Linoff.
I don't think you will end up with "way more tables", maybe one or two. If you had a table with the fields ID, Weapon, Min, Max ...
Then you could do a recordset search when needed. As you said, these values might never change, but changing them in a single spot seems much more admin-friendly than scouring code that you have left alone for a long time. My two cents' worth.
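For what it's worth, the worry about a database request per lookup can be sidestepped by reading the table once and keeping it in memory. A minimal sketch with Python and sqlite3; the table name weapon_stat_range, its columns, and the function names are assumptions of mine, and only the gun's accuracy range of 2 to 5 comes from the question.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE weapon_stat_range (
        weapon    TEXT,
        stat      TEXT,
        min_value INTEGER,
        max_value INTEGER,
        PRIMARY KEY (weapon, stat)
    );
    INSERT INTO weapon_stat_range VALUES ('gun', 'accuracy', 2, 5);
""")


def load_ranges():
    """Read every range once so crafting does not query the database each time."""
    rows = conn.execute("SELECT weapon, stat, min_value, max_value FROM weapon_stat_range")
    return {(weapon, stat): (lo, hi) for weapon, stat, lo, hi in rows}


RANGES = load_ranges()


def roll_stat(weapon, stat, rng=random):
    lo, hi = RANGES[(weapon, stat)]
    return rng.randint(lo, hi)  # randint is inclusive on both ends
```

Changing a range is then a one-row UPDATE plus a reload, and no game code has to be touched.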

Serializing data into a single text field - denormalization gone too far?

I'd love some opinions on whether this database design I'm currently pursuing is sound or not.
Let's assume I'm building a table called "Home", this table has a text field called "rooms". In this field is the serialized data for a set of rooms that this house has. My first instinct was to, of course, normalize this data into a separate "Rooms" table. However, due to some frustrating experiences with overly normalized databases in the past, I stopped to ask myself a few questions:
Will I ever need to find a specific room?
Will I ever need to update an individual room?
Will any Home records ever share Room records?
The answer to each of these questions is "no". Room records are all unique to each Home. Queries will never need to be performed to find out how many Homes in the database have bathrooms, for instance. Data will always be pulled from the perspective of the Home. The number of bedrooms and bathrooms will be explicitly stored on the Home record for searching.
So instead of having to constantly join Rooms, I wondered what would be the harm in serializing this data and just popping it into a text field.
This makes a lot of sense to me, but I'm hoping for a sanity check. Thanks for any input!
A pragmatic answer...
a) probability that you might want to decompose it in the future
b) benefit of not doing so now
c) cost of changing the schema later on.
If a * c > b then you should decompose now.
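As a worked example of that rule of thumb, with numbers that are entirely made up (costs and benefits measured in hours of work) and function and parameter names that are mine:

```python
def should_decompose_now(probability_needed_later, cost_of_changing_later, benefit_of_skipping_now):
    """Return True when the expected cost of deferring exceeds the benefit of the shortcut."""
    expected_later_cost = probability_needed_later * cost_of_changing_later
    return expected_later_cost > benefit_of_skipping_now


# a = 0.3 (30% chance you will need it), c = 40 hours to migrate later, b = 5 hours saved now
decompose = should_decompose_now(0.3, 40, 5)
```

Here a * c is 12 hours against a saving of 5, so the rule says to normalize now. Drop the probability to 0.05 and the shortcut wins.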
Well, you might not have a need TODAY to query to find out things like:
What is the average number of bathrooms in a home in Ohio?
Where do homes have more bedrooms? The East Coast or the West Coast?
How does house price correlate with the size of the master bedroom? What would be the average dollar value return of increasing the master bedroom size by 30%?
etc, etc.
You will be in a much better position in the future if you design your foundation correctly to begin with... no matter how enticing the short-cut may seem right now.
Plus, with a separate ROOMS table, you will be able to add additional room fields that make sense later (like width/height, color, floor level, etc.) which would all be very hard if the data were just globbed into a single field.
People will want to query in unexpected ways, like:
I have bad knees. Can you list houses with the master bedroom and master bathroom on the first floor?
In general, having a ROOMS table will just make your application more powerful, and easier to use.
Hey, I get what you're saying about "overly normalized data". We've all been there, and it DOES bite. However, having a ROOMS table in a database with housing info isn't being "overly normalized". It's just building the app the right way.
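To illustrate, the "bad knees" question becomes a short query once rooms are rows. This is only a sketch in Python with sqlite3; the table and column names, the sample data, and the home_with_rooms view (there so callers don't repeat the join) are all assumptions of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE home (id INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE room (
        id      INTEGER PRIMARY KEY,
        home_id INTEGER NOT NULL REFERENCES home(id),
        kind    TEXT,
        floor   INTEGER
    );
    -- The join lives in one place
    CREATE VIEW home_with_rooms AS
        SELECT h.id AS home_id, h.address, r.kind, r.floor
        FROM home h JOIN room r ON r.home_id = h.id;
    INSERT INTO home VALUES (1, '1 Elm St'), (2, '2 Oak Ave');
    INSERT INTO room VALUES
        (1, 1, 'master bedroom', 1), (2, 1, 'master bathroom', 1),
        (3, 2, 'master bedroom', 2), (4, 2, 'master bathroom', 1);
""")


def homes_with_master_suite_on_floor(floor):
    """Homes whose master bedroom AND master bathroom are both on the given floor."""
    rows = conn.execute("""
        SELECT home_id FROM home_with_rooms
        WHERE floor = ? AND kind IN ('master bedroom', 'master bathroom')
        GROUP BY home_id
        HAVING COUNT(DISTINCT kind) = 2
        ORDER BY home_id
    """, (floor,))
    return [row[0] for row in rows]
```

With the rooms serialized into a text field, the same question means loading and parsing every Home row in application code.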
In addition to what others have said about doing the right thing, I would like to add a comment about performance.
Since you will be storing the serialized room data as a column in table Home, the row size will increase significantly. This will result in worse performance for all other queries.
Well, you say that room records are unique, but you can't enforce that. So you have no way to know this for sure in your current design: all your code should be perfect in representing this.
"constantly joining" isn't that hard to do, but if it is, you can always make a View for that, and you're done.

Designing tables for storing various requirements and stats for multiplayer game

Original Question:
Hello,
I am creating a very simple hobby project - a browser-based multiplayer game. I am stuck at designing tables for storing information about quest / skill requirements.
For now, I designed my tables in the following way:
table user (basic information about users)
table stat (variety of stats)
table user_stats (connecting each user with stats)
Another example:
table monsters (basic information about npc enemies)
table monster_stats (connecting monsters with stats, using the same stat table from above)
Those were the simple cases. I must admit that I am stuck while designing requirements for different things, e.g. quests. Sample quest A might have only a minimum character level requirement (and that is easy to implement) - but another one, quest B, has a multitude of other reqs (finished quests, gained skills, possessing specific items, etc) - what is a good way of designing tables for storing this kind of information?
In a similar manner - what is an efficient way of storing information about skill requirements? (specific character class, min level, etc).
I would be grateful for any help or information about creating database driven games.
Edit:
Thank you for the answers, yet I would like to receive more. As I am having some problems designing a rather complicated database layout for craftable items, I am starting a max bounty for this question.
I would like to receive links to articles / code snippets / anything connected with best practices of designing databases for storing game data (a good example of this kind of information is available on buildingbrowsergames.com).
I would be grateful for any help.
I'll edit this to add as many other pertinent issues as I can, although I wish the OP would address my comment above. I speak from several years as a professional online game developer and many more years as a hobbyist online game developer, for what it's worth.
Online games imply some sort of persistence, which means that you have broadly two types of data - one is designed by you, the other is created by the players in the course of play. Most likely you are going to store both in your database. Make sure you have different tables for these and cross-reference them properly via the usual database normalisation rules. (eg. If your player crafts a broadsword, you don't create an entire new row with all the properties of a sword. You create a new row in the player_items table with the per-instance properties, and refer to the broadsword row in the item_types table which holds the per-itemtype properties.) If you find a row of data is holding some things that you designed and some things that the player is changing during play, you need to normalise it out into two tables.
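A minimal sketch of that split, using Python's sqlite3 so it can be run as-is. The table names player_items and item_types come from the paragraph above; the columns, the sample broadsword row, and the craft/describe helpers are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Designer-authored data: one row per kind of item
    CREATE TABLE item_types (
        id         INTEGER PRIMARY KEY,
        name       TEXT,
        max_damage INTEGER,
        weight     REAL
    );
    -- Player-generated data: one row per crafted item, per-instance state only
    CREATE TABLE player_items (
        id           INTEGER PRIMARY KEY,
        player_id    INTEGER NOT NULL,
        item_type_id INTEGER NOT NULL REFERENCES item_types(id),
        durability   INTEGER NOT NULL
    );
    INSERT INTO item_types VALUES (1, 'broadsword', 12, 4.5);
""")


def craft(player_id, item_type_id, durability=100):
    """Create an item instance; the type's properties are referenced, not copied."""
    cur = conn.execute(
        "INSERT INTO player_items (player_id, item_type_id, durability) VALUES (?, ?, ?)",
        (player_id, item_type_id, durability),
    )
    return cur.lastrowid


def describe(item_id):
    """Combine per-type and per-instance properties for display."""
    return conn.execute("""
        SELECT t.name, t.max_damage, i.durability
        FROM player_items i JOIN item_types t ON t.id = i.item_type_id
        WHERE i.id = ?
    """, (item_id,)).fetchone()
```

Rebalancing the broadsword is then a single UPDATE on item_types, and every crafted instance picks it up.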
This is really the typical class/instance separation issue, and applies to many things in such games: a goblin instance doesn't need to store all the details of what it means to be a goblin (eg. green skin), only things pertinent to that instance (eg. location, current health). Sometimes there is a subtlety to the act of construction, in that instance data needs to be created based on class data. (Eg. setting a goblin instance's starting health based upon a goblin type's max health.) My advice is to hard-code these into your code that creates the instances and inserts the row for it. This information only changes rarely since there are few such values in practice. (Initial scores of depletable resources like health, stamina, mana... that's about it.)
Try and find a consistent terminology to separate instance data from type data - this will make life easier later when you're patching a live game and trying not to trash the hard work of your players by editing the wrong tables. This also makes caching a lot easier - you can typically cache your class/type data with impunity because it only ever changes when you, the designer, pushes new data up there. You can run it through memcached, or consider loading it all at start up time if your game has a continuous process (ie. is not PHP/ASP/CGI/etc), etc.
Remember that deleting anything from your design-side data is risky once you go live, since player-generated data may refer back to it. Test everything thoroughly locally before deploying to the live server because once it's up there, it's hard to take it down. Consider ways to be able to mark rows of such data as removed in a safe fashion - maybe a boolean 'live' column which, if set to false, means it just won't show up in the typical query. Think about the impact on players if you disable items they earned (and doubly if these are items they paid for).
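One way the 'live' flag could look, sketched with Python and sqlite3. The item_types columns, the sample rows, and the helper names are assumptions; the idea of a boolean column that hides rows from typical queries is from the paragraph above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_types (
        id   INTEGER PRIMARY KEY,
        name TEXT,
        live INTEGER NOT NULL DEFAULT 1  -- 0 means retired, but the row stays
    );
    INSERT INTO item_types (id, name) VALUES (1, 'broadsword'), (2, 'bugged amulet');
""")


def retire(item_type_id):
    """Hide a type from normal queries without deleting the row."""
    with conn:
        conn.execute("UPDATE item_types SET live = 0 WHERE id = ?", (item_type_id,))


def craftable_names():
    """What the game offers to players: live types only."""
    rows = conn.execute("SELECT name FROM item_types WHERE live = 1 ORDER BY id")
    return [name for (name,) in rows]


def lookup_name(item_type_id):
    """Items players already own can still resolve their type, live or not."""
    row = conn.execute("SELECT name FROM item_types WHERE id = ?", (item_type_id,)).fetchone()
    return row[0] if row else None
```

Because nothing is deleted, foreign keys from player data keep pointing at a valid row, and un-retiring is just flipping the flag back.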
The actual crafting side can't really be answered without knowing how you want to design your game. The database design must follow the game design. But I'll run through a trivial idea. Maybe you will want to be able to create a basic object and then augment it with runes or crystals or whatever. For that, you just need a one-to-many relationship between item instance and augmentation instance. (Remember, you might have item type and augmentation type tables too.) Each augmentation can specify a property of an item (eg. durability, max damage done in combat, weight) and a modifier (typically as a multiplier, eg. 1.1 to add a 10% bonus). You can see my explanation for how to implement these modifying effects here and here - the same principles apply for temporary skill and spell effects as apply for permanent item modification.
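As a rough illustration of the property-plus-multiplier idea, here is a small Python sketch. The Augmentation class, the effective_value function, and the choice to stack multipliers by multiplying them together are my own assumptions, not necessarily how the linked explanations do it.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Augmentation:
    prop: str          # which item property it modifies, e.g. "max_damage"
    multiplier: float  # 1.1 means a 10% bonus


def effective_value(base_value, prop, augmentations):
    """Apply every augmentation that targets `prop` to the item type's base value."""
    factors = [a.multiplier for a in augmentations if a.prop == prop]
    return base_value * prod(factors)  # prod of an empty list is 1, so no change
```

In the database, each Augmentation would be a row in the augmentation instance table pointing at its item instance; the base value comes from the item type row.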
For character stats in a database driven game, I would generally advise to stick with the naïve approach of one column (integer or float) per statistic. Adding columns later is not a difficult operation and since you're going to be reading these values a lot, you might not want to be performing joins on them all the time. However, if you really do need the flexibility, then your method is fine. This strongly resembles the skill level table I suggest below: lots of game data can be modelled in this way - map a class or instance of one thing to a class or instance of other things, often with some additional data to describe the mapping (in this case, the value of the statistic).
Once you have these basic joins set up - and indeed any other complex queries that result from the separation of class/instance data in a way that may not be convenient for your code - consider creating a view or a stored procedure to perform them behind the scenes so that your application code doesn't have to worry about it any more.
Other good database practices apply, of course - use transactions when you need to ensure multiple actions happen atomically (eg. trading), put indices on the fields you search most often, use VACUUM/OPTIMIZE TABLE/whatever during quiet periods to keep performance up, etc.
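For the trading case, a minimal sketch of what "atomically" means in practice, using Python's sqlite3. The player_items layout, the index name, and the swap_items/owner_of helpers are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE player_items (id INTEGER PRIMARY KEY, owner_id INTEGER NOT NULL);
    CREATE INDEX idx_player_items_owner ON player_items(owner_id);  -- frequently searched column
    INSERT INTO player_items VALUES (1, 10), (2, 20);
""")


def swap_items(item_a, owner_a, item_b, owner_b):
    """Swap two items between owners; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for item, old, new in ((item_a, owner_a, owner_b), (item_b, owner_b, owner_a)):
                cur = conn.execute(
                    "UPDATE player_items SET owner_id = ? WHERE id = ? AND owner_id = ?",
                    (new, item, old),
                )
                if cur.rowcount != 1:
                    raise ValueError("item is not owned by the expected player")
        return True
    except ValueError:
        return False


def owner_of(item_id):
    return conn.execute("SELECT owner_id FROM player_items WHERE id = ?", (item_id,)).fetchone()[0]
```

If the second update fails, the first one is rolled back too, so a player never ends up giving an item away and getting nothing in return.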
(Original answer below this point.)
To be honest I wouldn't store the quest requirement information in the relational database, but in some sort of script. Ultimately your idea of a 'requirement' takes on several varying forms which could draw on different sorts of data (eg. level, class, prior quests completed, item possession) and operators (a level might be a minimum or a maximum, some quests may require an item whereas others may require its absence, etc) not to mention a combination of conjunctions and disjunctions (some quests require all requirements to be met, whereas others may only require 1 of several to be met). This sort of thing is much more easily specified in an imperative language. That's not to say you don't have a quest table in the DB, just that you don't try and encode the sometimes arbitrary requirements into the schema. I'd have a requirement_script_id column to reference an external script. I suppose you could put the actual script into the DB as a text field if it suits, too.
Skill requirements are suited to the DB though, and quite trivial given the typical game system of learning skills as you progress through levels in a certain class:
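CREATE TABLE skill_levels (
    skill_id  INT NOT NULL,  -- foreign key to skill.id
    class_id  INT NOT NULL,  -- foreign key to the character class table
    min_level INT NOT NULL
);
-- myPotentialSkillList: skills my class can learn, lowest level first
-- (:my_class_id is a bound parameter)
SELECT *
FROM skill_levels
INNER JOIN skill ON skill_levels.skill_id = skill.id
WHERE skill_levels.class_id = :my_class_id
ORDER BY skill_levels.min_level ASC;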
Need a skill tree? Add a column prerequisite_skill_id. And so on.
Update:
Judging by the comments, it looks like a lot of people have a problem with XML. I know it's cool to bash it now and it does have its problems, but in this case I think it works. One of the other reasons that I chose it is that there are a ton of libraries for parsing it, so that can make life easier.
The other key concept is that the information is really non-relational. So yes, you could store the data in any particular example in a bunch of different tables with lots of joins, but that's a pain. But if I kept giving you slightly different examples, I bet you'd have to modify your design ad infinitum. I don't think adding tables and modifying complicated SQL statements is very much fun. So it's a little frustrating that @scheibk's comment has been voted up.
Original Post:
I think the problem you might have with storing quest information in the database is that it isn't really relational (that is, it doesn't really fit easily into a table). That might be why you're having trouble designing tables for the data.
On the other hand, if you put your quest information directly into code, that means you'll have to edit the code and recompile each time you want to add a quest. Lame.
So if I were you I might consider storing my quest information in an XML file or something similar. I know that's the generic solution for just about anything, but in this case it sounds right to me. XML is really made for storing non-relational and/or hierarchical data, just like the stuff you need to store for your quest.
Summary: You could come up with your own schema, create your XML file, and then load it at run time somehow (or even store the XML in the database).
Example XML:
<quests>
  <quest name="Return Ring to Mordor">
    <characterReqs>
      <level>60</level>
      <finishedQuests>
        <quest name="Get Double Cheeseburger" />
        <quest name="Go to Vegas for the Weekend" />
      </finishedQuests>
      <skills>
        <skill name="nunchuks" />
        <skill name="plundering" />
      </skills>
      <items>
        <item name="genie's lamp" />
        <item name="noise cancelling headphones for robin williams' voice" />
      </items>
    </characterReqs>
    <steps>
      <step number="1">Get to Mordor</step>
      <step number="2">Throw Ring into Lava</step>
      <step number="3">...</step>
      <step number="4">Profit</step>
    </steps>
  </quest>
</quests>
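Loading a file like that at run time could look roughly like this in Python, using the standard xml.etree.ElementTree module. The trimmed-down XML, the shape of the character dict, and the function names are assumptions for the sketch; the element names follow the example.

```python
import xml.etree.ElementTree as ET

QUEST_XML = """
<quests>
  <quest name="Return Ring to Mordor">
    <characterReqs>
      <level>60</level>
      <finishedQuests><quest name="Get Double Cheeseburger" /></finishedQuests>
      <skills><skill name="nunchuks" /></skills>
      <items><item name="genie's lamp" /></items>
    </characterReqs>
  </quest>
</quests>
"""


def load_requirements(xml_text):
    """Map each top-level quest name to its parsed character requirements."""
    reqs = {}
    for quest in ET.fromstring(xml_text).findall("quest"):  # direct children only
        cr = quest.find("characterReqs")
        reqs[quest.get("name")] = {
            "level": int(cr.findtext("level", default="0")),
            "quests": {q.get("name") for q in cr.findall("finishedQuests/quest")},
            "skills": {s.get("name") for s in cr.findall("skills/skill")},
            "items": {i.get("name") for i in cr.findall("items/item")},
        }
    return reqs


def can_start(character, req):
    """All requirements must hold: minimum level plus subset checks on the sets."""
    return (
        character["level"] >= req["level"]
        and req["quests"] <= character["quests"]
        and req["skills"] <= character["skills"]
        and req["items"] <= character["items"]
    )
```

Adding a quest is then an edit to the XML file, with no schema change and no recompile.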
It sounds like you're ready for general object oriented design (OOD) principles. I'm going to purposefully ignore the context (gaming, MMO, etc) because that really doesn't matter to how you do a design process. And me giving you links is less useful than explaining what terms will be most helpful to look up yourself, IMO; I'll put those in bold.
In OOD, the database schema comes directly from your system design, not the other way around. Your design will tell you what your base object classes are and which properties can live in the same table (the ones in 1:1 relationship with the object) versus which to make mapping tables for (anything with 1:n or n:m relationships - for example, one user has multiple stats, so it's 1:n). In fact, if you do the OOD correctly, you will have zero decisions to make regarding the final DB layout.
The "correct" way to do any OO mapping is learned as a multi-step process called "Database Normalization". The basics of which is just as I described: find the "arity" of the object relationships (1:1, 1:n,...) and make mapping tables for the 1:n's and n:m's. For 1:n's you end up with two tables, the "base" table and a "base_subobjects" table (eg. your "users" and "user_stats" is a good example) with the "foreign key" (the Id of the base object) as a column in the subobject mapping table. For n:m's, you end up with three tables: "base", "subobjects", and "base_subobjects_map" where the map has one column for the base Id and one for the subobject Id. This might be necessary in your example for N quests that can each have M requirements (so the requirement conditions can be shared among quests).
That's 85% of what you need to know. The rest is how to handle inheritance, which I advise you to just skip unless you're masochistic. Now just go figure out how you want it to work before you start coding stuff up and the rest is cake.
The thread in @Shea Daniel's answer is on the right track: the specification for a quest is non-relational, and also includes logic as well as data.
Using XML or Lua are examples, but the more general idea is to develop your own Domain-Specific Language to encode quests. Here are a few articles about this concept, related to game design:
The Whimsy Of Domain-Specific Languages
Using a Domain Specific Language for Behaviors
Using Domain-Specific Modeling towards Computer Games Development Industrialization
You can store the block of code for a given quest into a TEXT field in your database, but you won't have much flexibility to use SQL to query specific parts of it. For instance, given the skills a character currently has, which quests are open to him? This won't be easy to query in SQL, if the quest prerequisites are encoded in your DSL in a TEXT field.
You can try to encode individual prerequisites in a relational manner, but it quickly gets out of hand. Relational and object-oriented just don't go well together. You can try to model it this way:
Chars <--- CharAttributes --> AllAttributes <-- QuestPrereqs --> Quests
And then do a LEFT JOIN looking for any quests for which no prereqs are missing in the character's attributes. Here's pseudo-code:
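-- Column names (attr_id, char_id) and the :char_id parameter are assumed for illustration.
SELECT qp.quest_id
FROM QuestPrereqs qp
JOIN AllAttributes a ON a.attr_id = qp.attr_id
LEFT JOIN CharAttributes ca
       ON ca.attr_id = a.attr_id AND ca.char_id = :char_id
GROUP BY qp.quest_id
HAVING COUNT(a.attr_id) = COUNT(ca.attr_id);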
But the problem with this is that now you have to model every aspect of your character that could be a prerequisite (stats, skills, level, possessions, quests completed) as some kind of abstract "Attribute" that fits into this structure.
This solves this problem of tracking quest prerequisites, but it leaves you with another problem: the character is modeled in a non-relational way, essentially an Entity-Attribute-Value architecture which breaks a bunch of relational rules and makes other types of queries incredibly difficult.
Not directly related to the design of your database, but a similar question was asked a few weeks back about class diagram examples for an RPG.
I'm sure you can find something useful in there :)
Regarding your basic structure, you may (depending on the nature of your game) want to consider driving toward convergence of representation between player character and non-player characters, so that code that would naturally operate the same on either doesn't have to worry about the distinction. This would suggest, instead of having user and monster tables, having a character table that represents everything PCs and NPCs have in common, and then a user table for information unique to PCs and/or user accounts. The user table would have a character_id foreign key, and you could tell a player character row by the fact that a user row exists corresponding to it.
For representing quests in a model like yours, the way I would do it would look like:
quest_model
===============
id
name ['Quest for the Holy Grail', 'You Killed My Father', etc.]
etc.
quest_model_req_type
===============
id
name ['Minimum Level', 'Skill', 'Equipment', etc.]
etc.
quest_model_req
===============
id
quest_id
quest_model_req_type_id
value [10 (for Minimum Level), 'Horseback Riding' (for Skill), etc.]
quest
===============
id
quest_model_id
user_id
status
etc.
So a quest_model is the core definition of the quest structure; each quest_model can have 0..n associated quest_model_req rows, which are requirements specific to that quest model. Every quest_model_req is associated with a quest_model_req_type, which defines the general type of requirement: achieving a Minimum Level, having a Skill, possessing a piece of Equipment, and so on. The quest_model_req also has a value, which configures the requirement for this specific quest; for example, a Minimum Level type requirement might have a value of 20, meaning you must be at least level 20.
The quest table, then, is individual instances of quests that players are undertaking or have undertaken. The quest is associated with a quest_model and a user (or perhaps character, if you ever want NPCs to be able to do quests!), and has a status indicating where the progress of the quest stands, and whatever other tracking turns out useful.
This is a bare-bones structure that would, of course, have to be built out to accommodate the needs of particular games, but it should illustrate the direction I'd recommend.
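Here is a small runnable sketch of how the requirement rows might be evaluated, in Python with sqlite3. The table and column names follow the layout above; the CHECKERS dictionary, the shape of the character dict, and the sample rows are assumptions of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE quest_model (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE quest_model_req_type (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE quest_model_req (
        id INTEGER PRIMARY KEY,
        quest_id INTEGER REFERENCES quest_model(id),
        quest_model_req_type_id INTEGER REFERENCES quest_model_req_type(id),
        value TEXT
    );
    INSERT INTO quest_model VALUES (1, 'Quest for the Holy Grail');
    INSERT INTO quest_model_req_type VALUES (1, 'Minimum Level'), (2, 'Skill');
    INSERT INTO quest_model_req VALUES (1, 1, 1, '10'), (2, 1, 2, 'Horseback Riding');
""")

# One checker per requirement type name; each interprets `value` its own way.
CHECKERS = {
    "Minimum Level": lambda character, value: character["level"] >= int(value),
    "Skill": lambda character, value: value in character["skills"],
}


def meets_requirements(character, quest_model_id):
    """True when every requirement row of the quest model is satisfied."""
    rows = conn.execute("""
        SELECT t.name, r.value
        FROM quest_model_req r
        JOIN quest_model_req_type t ON t.id = r.quest_model_req_type_id
        WHERE r.quest_id = ?
    """, (quest_model_id,))
    return all(CHECKERS[name](character, value) for name, value in rows)
```

New requirement types need one row in quest_model_req_type and one entry in CHECKERS; new quests that reuse existing types need only data.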
Oh, and since someone else threw around their credentials, mine are that I've been a hobbyist game developer on live, public-facing projects for 16 years now.
I'd be extremely careful of what you actually store in a DB, especially for an MMORPG. Keep in mind, these things are designed to be MASSIVE with thousands of users, and game code has to execute excessively quickly and send a crap-ton of data over the network, not only to the players on their home connections but also between servers on the back-end. You're also going to have to scale out eventually and databases and scaling out are not two things that I feel mix particularly well, particularly when you start sharding into different regions and then adding instance servers to your shards and so on. You end up with a whole lot of servers talking to databases and passing a lot of data, some of which isn't even relevant to the game at all (SQL text going to a SQL server is useless network traffic that you should cut down on).
Here's a suggestion: Limit your SQL database to storing only things that will change as players play the game. Monsters and monster stats will not change. Items and item stats will not change. Quest goals will not change. Don't store these things in a SQL database, instead store them in the code somewhere.
Doing this means that every server that ever lives will always know all of this information without ever having to query a database. Now, you don't store quests at all, you just store accomplishments of the player and the game programmatically determines the effects of those quests being completed. You don't waste data transferring information between servers because you're only sending event ID's or something of that nature (you can optimize the data you pass by only using just enough bits to represent all the event ID's and this will cut down on network traffic. May seem insignificant but nothing is insignificant in massive network apps).
Do the same thing for monster stats and item stats. These things don't change during gameplay so there's no need to keep them in a DB at all and therefore this information NEVER needs to travel over the network. The only thing you store is the ID of the items or monster kills or anything like that which is non-deterministic (i.e. it can change during gameplay in a way which you can't predict). You can have dedicated item servers or monster stat servers or something like that and you can add those to your shards if you end up having huge numbers of these things that occupy too much memory, then just pass the data that's necessary for a particular quest or area to the instance server that is handling that thing to cut down further on space, but keep in mind that this will up the amount of data you need to pass down the network to spool up a new instance server so it's a trade-off. As long as you're aware of the consequences of this trade-off, you can use good judgement and decide what you want to do. Another possibility is to limit instance servers to a particular quest/region/event/whatever and only equip it with enough information to the thing it's responsible for, but this is more complex and potentially limits your scaling out since resource allocation will become static instead of dynamic (if you have 50 servers of each quest and suddenly everyone goes on the same quest, you'll have 49 idle servers and one really swamped server). Again, it's a trade-off so be sure you understand it and make good choices for your application.
Once you've identified exactly what information in your game is non-deterministic, then you can design a database around that information. That becomes a bit easier: players have stats, players have items, players have skills, players have accomplishments, etc, all fairly easy to map out. You don't need descriptions for things like skills, accomplishments, items, etc, or even their effects or names or anything since the server can determine all that stuff for you from the ID's of those things at runtime without needing a database query.
Now, a lot of this probably sounds like overkill to you. After all, a good database can do queries very rapidly. However, your bandwidth is extremely precious, even in the data center, so you need to limit your use of it to only what is absolutely necessary to send and only send that data when it's absolutely necessary that it be sent.
Now, for representing quests in code, I would consider the specification pattern (http://en.wikipedia.org/wiki/Specification_pattern). This will allow you to easily build up quest goals in terms of what events are needed to ensure that the specification for completing that quest is met. You can then use Lua (or something) to define your quests as you build the game so that you don't have to make massive code changes and rebuild the whole damn thing to make it so that you have to kill 11 monsters instead of 10 to get the Sword of 1000 truths in a particular quest. How to actually do something like that I think is beyond the scope of this answer and starts to hit the edge of my knowledge of game programming so maybe someone else on here can help you out if you choose to go that route.
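For anyone curious what the specification pattern might look like here, a minimal Python sketch follows. The class names, the event IDs, and the idea of passing a dict of event counts are all assumptions for illustration, not a reference implementation.

```python
from dataclasses import dataclass


class Spec:
    """Base specification: answers whether a player's event counts satisfy it."""

    def is_satisfied_by(self, events: dict) -> bool:
        raise NotImplementedError

    def __and__(self, other):
        return AndSpec(self, other)

    def __or__(self, other):
        return OrSpec(self, other)


@dataclass(frozen=True)
class AtLeast(Spec):
    event_id: str
    count: int = 1

    def is_satisfied_by(self, events):
        return events.get(self.event_id, 0) >= self.count


@dataclass(frozen=True)
class AndSpec(Spec):
    left: Spec
    right: Spec

    def is_satisfied_by(self, events):
        return self.left.is_satisfied_by(events) and self.right.is_satisfied_by(events)


@dataclass(frozen=True)
class OrSpec(Spec):
    left: Spec
    right: Spec

    def is_satisfied_by(self, events):
        return self.left.is_satisfied_by(events) or self.right.is_satisfied_by(events)


# Hypothetical quest goal: kill 10 boars and hold either an iron key or a lockpick.
sword_quest = AtLeast("kill_boar", 10) & (AtLeast("has_iron_key") | AtLeast("has_lockpick"))
```

Changing the goal from 10 kills to 11 is a one-number edit in the quest definition, which is the part you would move out into a script.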
Also, I know I used a lot of terms in this answer, please ask if there are any that you are unfamiliar with and I can explain them.
Edit: didn't notice your addition about craftable items. I'm going to assume that these are things that a player can create specifically in the game, like custom items. If a player can continually change these items, then you can just combine the attributes of what they're crafted as at runtime but you'll need to store the ID of each attribute in the DB somewhere. If you make a finite number of things you can add on (like gems in Diablo II) then you can eliminate a join by just adding that number of columns to the table. If there are a finite number of items that can be crafted and a finite number of ways that different things can be joined together into new items, then when certain items are combined, you needn't store the combined attributes; it just becomes a new item which has been defined at some point by you already. Then, they just have that item instead of its components. If you clarify the behavior your game is to have I can add additional suggestions if that would be useful.
I would approach this from an Object Oriented point of view, rather than a Data Centric point of view. It looks like you might have quite a lot of (poss complex) objects - I would recommend getting them modeled (with their relationships) first, and relying on an ORM for persistence.
When you have a data-centric problem, the database is your friend. What you have done so far seems to be quite right.
On the other hand, the other problems you mention seem to be behaviour-centric. In this case, an object-oriented analysis and solution will work better.
For example:
Create a quest class with specificQuest child classes. Each child should implement a bool HasRequirements(Player player) method.
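A quick sketch of that idea in Python (so the method is spelled has_requirements rather than HasRequirements). The Player fields and the two example subclasses are assumptions made up for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Player:
    level: int
    skills: set = field(default_factory=set)
    finished_quests: set = field(default_factory=set)


class Quest(ABC):
    @abstractmethod
    def has_requirements(self, player: Player) -> bool:
        """Return True if the player may start this quest."""


class MinimumLevelQuest(Quest):
    """The simple case: only a minimum character level."""

    def __init__(self, min_level: int):
        self.min_level = min_level

    def has_requirements(self, player: Player) -> bool:
        return player.level >= self.min_level


class SequelQuest(Quest):
    """The harder case: a level, a skill, and an earlier quest, all at once."""

    def __init__(self, min_level: int, skill: str, previous_quest: str):
        self.min_level = min_level
        self.skill = skill
        self.previous_quest = previous_quest

    def has_requirements(self, player: Player) -> bool:
        return (
            player.level >= self.min_level
            and self.skill in player.skills
            and self.previous_quest in player.finished_quests
        )
```

Each quest keeps its own arbitrary logic, and the calling code only ever asks one question of it.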
Another option is some sort of rules engine (Drools, for example if you are using Java).
If I were designing a database for such a situation, I might do something like this:
Quest:
[quest properties like name and description], reqItemsID, reqSkillsID, reqPlayerTypesID
RequiredItems:
ID, item
RequiredSkills:
ID, skill
RequiredPlayerTypes:
ID, type
Here, the IDs map to the respective tables; you then retrieve all entries under that ID to get the list of required items, skills, and so on. If you allow dynamic creation of items then you should have a mapping to another table that contains all possible items.
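As a rough illustration of "retrieve all entries under that ID", here is a sketch using Python's built-in sqlite3 module. Only the item requirement is shown; the questID and name columns and the sample rows are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Quest (questID INTEGER PRIMARY KEY, name TEXT, reqItemsID INTEGER);
-- ID is a group identifier, so several rows may share it.
CREATE TABLE RequiredItems (ID INTEGER, item TEXT);
INSERT INTO Quest VALUES (1, 'Find the lost tomb', 10);
INSERT INTO RequiredItems VALUES (10, 'torch'), (10, 'rope'), (11, 'sword');
""")


def required_items(conn, quest_name):
    """Return every item listed under the quest's reqItemsID, sorted by name."""
    rows = conn.execute(
        "SELECT r.item FROM Quest q "
        "JOIN RequiredItems r ON r.ID = q.reqItemsID "
        "WHERE q.name = ? ORDER BY r.item",
        (quest_name,),
    )
    return [row[0] for row in rows]
```

The same pattern would repeat for RequiredSkills and RequiredPlayerTypes.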
Another thing to keep in mind is normalization. There's a long article here but I've condensed the first three levels, more or less, into the following:
First normal form means that there are no database entries where a specific field has more than one item in it.
Second normal form means that, in each table with a composite primary key, all other fields are fully dependent on the entire key, not just parts of it.
Third normal form is where you have no non-key fields that are dependent on other non-key fields in any table.
[Disclaimer: I have very little experience with SQL databases, and am new to this field. I just hope I'm of help.]
I've done something sort of similar and my general solution was to use a lot of meta data. I'm using the term loosely to mean that any time I needed new data to make a given decision (allow a quest, allow using an item, etc.) I would create a new attribute. This was basically just a table with an arbitrary number of values and descriptions. Then each character would have a list of these types of attributes.
Ex: List of Kills, Level, Regions visited, etc.
The two things this does to your dev process are:
1) Every time there's an event in the game you need to have a big old switch block that checks all these attribute types to see if something needs updating
2) Every time you need some data, check all your attribute tables BEFORE you add a new one.
I found this to be a good rapid development strategy for a game that grows organically (not completely planned out on paper ahead of time) - but its one big limitation is that your past/current content (levels, events, etc.) will not be compatible with future attributes - i.e. that map won't give you a region badge because there were no region badges when you coded it. This of course requires you to update past content when new attributes are added to the system.
Just some little points for your consideration:
1) Always try to make your "get quest" requirements simple and your "finish quest" requirements complicated.
Part 1 can be done by arranging your quests in a hierarchical order:
example :
QuestA : (Kill Raven the demon) (quest req: Lvl1)
QuestA.1 : Save "unknown" in the forest to obtain some info.. (quest req : QuestA)
QuestA.2 : Craft the sword of Crystal ... etc.. (quest req : QuestA.1 == Done)
QuestA.3 : ... etc.. (quest req : QuestA.2 == Done)
QuestA.4 : ... etc.. (quest req : QuestA.3 == Done)
etc...
QuestB (Find the lost tomb) (quest req : ( QuestA.status == Done) )
QuestC (Go To the demons Hypermarket) ( Quest req: ( QuestA.status == Done && player.level == 10) )
etc....
Doing this would save you lots of data fields/table joins.
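The hierarchy above could be checked with something like the following sketch. The dictionary layout, the 'active'/'done' status strings and the use of a minimum level (rather than an exact level match for QuestC) are assumptions made for the example.

```python
# Hypothetical quest table: each quest has a minimum level and at most one
# prerequisite, given as (quest name, status that quest must have).
QUESTS = {
    "QuestA":   {"min_level": 1,  "requires": None},
    "QuestA.1": {"min_level": 1,  "requires": ("QuestA", "active")},
    "QuestA.2": {"min_level": 1,  "requires": ("QuestA.1", "done")},
    "QuestB":   {"min_level": 1,  "requires": ("QuestA", "done")},
    "QuestC":   {"min_level": 10, "requires": ("QuestA", "done")},
}


def can_start(quest, level, status):
    """status maps quest name -> 'active' or 'done' for quests the player has touched."""
    spec = QUESTS[quest]
    if level < spec["min_level"]:
        return False
    if spec["requires"] is None:
        return True
    needed_quest, needed_status = spec["requires"]
    return status.get(needed_quest) == needed_status
```

Because every quest points at a single parent, the "get quest" check stays a single lookup instead of a join over several requirement tables.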
ADDITIONAL THOUGHTS:
If you use the above system, you can add an extra reward field to your quest table called "enableQuests" and add the names of the quests that need to be enabled.
Logically, you'd have an "enabled" field assigned to each quest.
2) A minor solution for your crafting problem: create crafting recipes, i.e. items that store the crafting requirements of the item to be crafted.
So when a player tries to craft an item, he needs to buy a recipe first, then try crafting.
A simple example of such an item description would be:
ItemName: "Legendary Sword of the dead"
Craft level req. : 75
Items required:
Item_1 : Blade of the dead
Item_2 : A cursed seal
Item_3 : Holy Gemstone of the dead
etc...
When he presses the "craft" action, you can parse the recipe and compare it against his inventory/craft box.
So your crafting DB will have only 1 field (or 2 if you want to add a crafting level req., though it will already be included in the recipe).
ADDITIONAL THOUGHTS:
Such items can be stored in XML format in the table, which would make them much easier to parse.
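A small sketch of what parsing such a recipe and comparing it against an inventory might look like, using Python's standard xml.etree module. The element and attribute names (recipe, item, craftLevel) are invented for the example; the answer does not specify an XML layout.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical XML layout for the recipe described above.
RECIPE_XML = """
<recipe name="Legendary Sword of the dead" craftLevel="75">
  <item>Blade of the dead</item>
  <item>A cursed seal</item>
  <item>Holy Gemstone of the dead</item>
</recipe>
"""


def can_craft(recipe_xml, craft_level, inventory):
    """Check the craft level, then that every required item is in the inventory."""
    recipe = ET.fromstring(recipe_xml.strip())
    if craft_level < int(recipe.get("craftLevel")):
        return False
    needed = Counter(item.text for item in recipe.findall("item"))
    have = Counter(inventory)
    # Counter returns 0 for missing items, so absent items fail the check.
    return all(have[name] >= count for name, count in needed.items())
```

Using Counter means a recipe that lists the same item twice would require two copies in the inventory.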
3) A similar XML system can be applied to your quest system to implement quest-ending requirements.

Is normalizing a person's name going too far?

You usually normalize a database to avoid data redundancy. It's easy to see in a table full of names that there is plenty of redundancy. If your goal is to create a catalog of the names of every person on the planet (good luck), I can see how normalizing names could be beneficial. But in the context of the average business database is it overkill?
(Of course I know you could take anything to an extreme... say if you normalized down to syllables... or even adjacent character pairs. I can't see a benefit in going that far)
Update:
One possible justification for this is a random name generator. That's all I could come up with off the top of my head.
Yes, it's overkill.
People don't change their names from Bill to Joe all at once.
Database normalization usually refers to normalizing the field, not its content. In other words, you would normalize that there only be one first name field in the database. That is generally worthwhile. However the data content should not be normalized, since it is individual to that person - you are not picking from a list, and you are not changing a list in one place to affect everybody - that would be a bug, not a feature.
How do you normalize a name? Not all names have the same structure. Not all countries or cultures use the same rules for names. A first name is not necessarily just a first name. People have variable numbers of names. Some countries don't have the simple pair of firstname/lastname. What if my first name just so happens to be your last name, should they be considered the same in your database? If not, then you get into the problem that last name might mean different things in different countries. In most countries I know of, it is a family name. Your last name is the same as at least one of your parents' last name. In Iceland, it is your father's first name, followed by "son" or "daughter". So the same last name will mean completely different things depending on whether you encounter it in Iceland or the US.
In some cultures it is common when getting married, for the woman to take her husband's last name. In other cultures, that's completely optional, or might even work the opposite way.
How can you normalize this? What information would it gain you? If you find someone in your database who has "Smith" as the last word making up their name, what does that tell you? It might not be their family name. It might only be part of the family name. It might be an honorific in some language, but which according to their culture, should be considered part of the name.
You can only normalize data if it follows a common structure.
If you had a need to perform queries based on diminutive names I could see a need for normalizing the names. e.g. a search for "Betty" may need to return results for "Betty", "Beth", and "Elizabeth"
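One way to picture that kind of search is a lookup from diminutive to canonical name. The sketch below is only an illustration: the mapping is a tiny made-up sample, and in a database it would live in its own table.

```python
# Hypothetical diminutive -> canonical mapping; a real one would be far larger.
CANONICAL = {
    "betty": "elizabeth",
    "beth": "elizabeth",
    "liz": "elizabeth",
    "bill": "william",
}


def canonical(name):
    """Map a name to its canonical form, or to itself if it has no entry."""
    key = name.strip().lower()
    return CANONICAL.get(key, key)


def search(people, query):
    """Return every stored name that shares a canonical form with the query."""
    target = canonical(query)
    return [p for p in people if canonical(p) == target]
```

With this, search(["Betty", "Beth", "Elizabeth", "Bob"], "Betty") matches the first three names and skips "Bob".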
Yes, definitely overkill. What's a few dozen bytes between friends?
Maybe if you work in the Census office it might make sense. Otherwise, see every other answer :)
I would say yes, it is going too far in 95%+ of the cases.
Yes. I cannot think of an instance where the benefits outweigh the problems and query complications.
No, but you might want to normalise to a canonical record for a customer (so you don't get 5 different entries for 'Bloggs & Co.' in your database). This is a data cleansing issue that often bites on MIS projects.
You often don't go over fourth form normalization in a database. Therefore seventh form normalization is quite a bit overboard. The only place this might even be a remotely plausible idea is in some kind of massive data warehouse.
Generally yes. Normalizing to that level would be going too far. Depending on the queries (such as phone books where searches by last name are common) it might be worthwhile. I expect that to be rare.
I generally haven't seen a need to normalize the name, mainly because that adds a performance hit on the join that will always be called, and doesn't give any benefit.
If you have so many similar names, and have a storage problem then it may be worth it, but there will be a performance hit that would need to be considered.
I would say it is absolutely overkill. In most applications, you display folks' names so often, every query involved with that is going to look that much more complex and harder to read.
Yes, it is. It is commonly recognized that just applying all of the Rules of Normalization can cause you to go way too far and end up with an overnormalized database. For example, it would be possible to normalize every instance of every character to a reference to a character enumeration table. It's easy to see that that's ridiculous.
Normalization needs to be performed at a level that is appropriate for your problem domain. Overnormalization is as much a problem as undernormalization (although, of course, for different reasons).
There might be a case where being able to link married/maiden names would be useful.
Recently had a case where I had to rename thousands of emails in Exchange because somebody got divorced and didn't want any emails listing her as married_name@company.com
No need to normalize to that level unless the names make up a composite primary key and you have data that is dependent on one of the names (e.g. anyone with the surname Plummer knows nothing about databases). In which case, by not normalizing, you would violate second normal form.
I agree with the general response, you wouldn't do that.
One thing comes to mind though, compression. If you had a billion people and you found that 60% of first names were pulled from 5 very common names, you could use some tricky bit manipulation to reduce the size very significantly. It would also require very customized database software.
But this isn't for the purpose of normalization, just compression.
You should normalize it out if you need to avoid the delete anomaly that comes with not breaking it out. That is, if you ever need to answer the question, has my database ever had a person named "Joejimbobjake" in it, you need to avoid the anomaly. Soft deletes are probably a much better way than having a comprehensive first name table (for example), but you get my point.
In addition to all the points everyone else has made, consider that if you were implementing a data entry operation (for example), and were to insert a new contact, you would have to search your first name and last name tables to locate the correct IDs and then use those values. But then this is further complicated by the occasion when the name is not in the FN and/or LN tables: then you have to insert the new first/last name and use the new ID(s).
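To make that extra work concrete, here is a sketch of the lookup-or-insert step with Python's sqlite3 module. The table and column names (FirstName, LastName, Contact) are assumptions for the example, not a recommended design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FirstName (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE LastName  (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE Contact (
    id INTEGER PRIMARY KEY,
    first_name_id INTEGER NOT NULL REFERENCES FirstName(id),
    last_name_id  INTEGER NOT NULL REFERENCES LastName(id)
);
""")


def get_or_create(conn, table, name):
    """Find the id for a name, inserting the name first if it is missing."""
    # The table name is interpolated, so only the two fixed tables are allowed.
    if table not in ("FirstName", "LastName"):
        raise ValueError(table)
    row = conn.execute(f"SELECT id FROM {table} WHERE name = ?", (name,)).fetchone()
    if row:
        return row[0]
    return conn.execute(f"INSERT INTO {table} (name) VALUES (?)", (name,)).lastrowid


def insert_contact(conn, first, last):
    """Every insert needs up to two lookups and two extra inserts before the real one."""
    first_id = get_or_create(conn, "FirstName", first)
    last_id = get_or_create(conn, "LastName", last)
    cur = conn.execute(
        "INSERT INTO Contact (first_name_id, last_name_id) VALUES (?, ?)",
        (first_id, last_id),
    )
    return cur.lastrowid
```

With a plain name column on Contact, all of this collapses into a single INSERT.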
And if you think that you have a comprehensive list of names, think again. I work with a list of over 200k unique first names and I'd guess it represents 99.9% of the US population. But that .1% = a lot of people. And don't forget the foreign names and misspellings...

Conflicting desires in Database Design, with fields of two similar functions

Okay, so I'm making a table right now for "Box Items".
Now, a Box Item, depending on what it's being used for/the status of the item, may end up being related to a "Shipping" box or a "Returns" box.
A Box Item may be defective: if it is, a flag will be set in the Box Item's row (IsDefective), and the Box Item will be put in a "Returns" box (with other items to be returned to that vendor). Otherwise, the Box Item will eventually be put into a "Shipping" box (with other items to be shipped). (Note that Shipping and Returns boxes have their own tables: there's not one common table for all boxes... though maybe I should consider doing that if possible as a third possibility?)
Maybe I'm just not thinking clearly today, but I started questioning what should be done in this situation.
My gut tells me that I should have a separate field for each possible relation, even if only one of the relations can happen at any given time, which would make the schema for Box Items look like:
BoxItemID
Description
IsDefective
ShippingBoxID
ReturnBoxID
etc...
This would make the relations clear, but it seems wasteful (since only one of the relations will be used at any time). So then I thought I could have just one field for the BoxID, and determine which BoxID it's referring to (a Shipping or a Returns Box ID) based on the IsDefective field:
BoxItemID
Description
IsDefective
BoxID
etc...
This seems less wasteful, but doesn't sit right with me. The relation isn't obvious.
So, I put it to you, database gurus of Stackoverflow. What would you do in this situation?
EDIT: Thank you everyone for your input! It's given me a lot to think about. For one, I'm going to use an ORM next time I start a project like this. =) For two, since I'm not right now, I'll bite the four bytes and use two fields.
Thanks everyone again!
I'm with Psychotic Venom and mattlant.
Going the polymorphic route (having to figure out which table your foreign key points to based on the contents of another field) is going to be a pain. Coding the constraints for that may be tough (I'm not sure most databases would support that natively, I think you'd have to use a trigger).
Do items ever move between the tables? Sticking with two tables with identical definitions, where one is for returns and one is for shipping, may be the easiest route. Sticking with the definition you first proposed (with the two separate fields) is also perfectly reasonable.
"Premature optimization is the root of all evil" and all that. While it seems wasteful, remember what you're storing. Since they are IDs they are probably just integers, maybe 4 bytes. Wasting four bytes per record is basically nothing. In fact, due to padding to put things on even addresses or other such things it may be "free" to put that extra field in there. It all depends on the DB design.
Unless you have a very good reason to go the polymorphic route (like you're on an embedded system with little memory or you have to replicate across some really slow 9600bps link) it probably won't be worth the headaches you can end up with. Having to write all those special cases into your queries can get annoying.
Quick example: a join between two tables where the table you join to depends on whether the IsDefective flag is set is going to be a pain. Being able to just use one of the two columns directly will probably save you enough hassle to be worth it; it would for me.
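A sketch of the two-column design in SQLite (via Python's sqlite3 module), to show that each column can carry an ordinary foreign key and that a plain CHECK constraint can stop both being set at once. The CHECK and the sample rows are additions for illustration; the column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE ShippingBox (ShippingBoxID INTEGER PRIMARY KEY);
CREATE TABLE ReturnBox   (ReturnBoxID   INTEGER PRIMARY KEY);
CREATE TABLE BoxItem (
    BoxItemID     INTEGER PRIMARY KEY,
    Description   TEXT,
    IsDefective   INTEGER NOT NULL DEFAULT 0,
    ShippingBoxID INTEGER REFERENCES ShippingBox(ShippingBoxID),
    ReturnBoxID   INTEGER REFERENCES ReturnBox(ReturnBoxID),
    -- Assumed rule: an item is in at most one box at a time.
    CHECK (ShippingBoxID IS NULL OR ReturnBoxID IS NULL)
);
INSERT INTO ShippingBox VALUES (1);
INSERT INTO ReturnBox VALUES (1);
INSERT INTO BoxItem VALUES (1, 'widget', 0, 1, NULL);
INSERT INTO BoxItem VALUES (2, 'cracked widget', 1, NULL, 1);
""")


def both_boxes_rejected(conn):
    """Try to put one item in both boxes; the CHECK constraint should refuse."""
    try:
        conn.execute("INSERT INTO BoxItem VALUES (3, 'confused widget', 0, 1, 1)")
    except sqlite3.IntegrityError:
        return True
    return False


# The join needs no special case on IsDefective: it just uses the one column.
shipped = conn.execute(
    "SELECT i.Description FROM BoxItem i "
    "JOIN ShippingBox s ON s.ShippingBoxID = i.ShippingBoxID"
).fetchall()
```

The cost is one nullable integer column per row, which is the "four bytes" trade-off discussed above.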
I would consider making a single table for the boxes and the box type be a column of the box table. This would simplify the relationships and make it easy to still query for box type. So the box item only has one foreign key to the boxId.
I'd use what Hibernate calls Table-per-subclass, so my DB would wind up with 3 tables for Boxes: Box, ShippingBox, and ReturnBox. The FK in BoxItem would point to Box.
What you're talking about is polymorphic relations. A single ID that can reference multiple other tables. There are several frameworks that support this, however, it is (potentially) bad for database integrity (that could be a whole other discussion whether or not your database or your application should maintain referential integrity).
What about this?
BoxItem:
BoxItemID, Description, IsDefective
Box:
BoxID, Description
BoxItemMap:
BoxID, BoxItemID, BoxItemType
Then you can have BoxItemType be an enumeration, or an integer where you define constants in your application as "Return" or "Shipping" as the type of box.
Agree about the polymorphic discussion above, although it has potential to be used poorly, it is still a viable solution.
Basically you have a base table called box. Then you have two other tables, shipping box and return box. Those two add any extra fields that are special to them. They are related to box with a 1:1 FK. The box base table has the common fields of all box types.
You relate BoxItem to the box table. You get the proper box type with a query that joins the child box to the root box on the key. A box that has a record in both the base box table and a child box table is of that child's type.
You just have to be careful, as mentioned, that when you create a box of a given type it is done correctly. But that's what testing is for. The code to add them only needs to be written once. Or use an ORM.
Almost all ORMs support this strategy.
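Without an ORM, the base-table-plus-child-tables layout might look like the sketch below (SQLite through Python's sqlite3 module). The Carrier and Vendor columns stand in for "extra fields that are special to them" and are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Box (BoxID INTEGER PRIMARY KEY, Description TEXT);
-- Child tables share the parent's key, which gives the 1:1 relation.
CREATE TABLE ShippingBox (BoxID INTEGER PRIMARY KEY REFERENCES Box(BoxID), Carrier TEXT);
CREATE TABLE ReturnBox   (BoxID INTEGER PRIMARY KEY REFERENCES Box(BoxID), Vendor TEXT);
CREATE TABLE BoxItem (
    BoxItemID   INTEGER PRIMARY KEY,
    Description TEXT,
    BoxID       INTEGER REFERENCES Box(BoxID)
);
INSERT INTO Box VALUES (1, 'outbound'), (2, 'back to vendor');
INSERT INTO ShippingBox VALUES (1, 'some carrier');
INSERT INTO ReturnBox VALUES (2, 'some vendor');
INSERT INTO BoxItem VALUES (1, 'widget', 1), (2, 'cracked widget', 2);
""")


def box_type_of(conn, box_item_id):
    """Work out the box type from whichever child table has a matching row."""
    row = conn.execute(
        """
        SELECT CASE
                 WHEN s.BoxID IS NOT NULL THEN 'Shipping'
                 WHEN r.BoxID IS NOT NULL THEN 'Return'
               END
        FROM BoxItem i
        JOIN Box b ON b.BoxID = i.BoxID
        LEFT JOIN ShippingBox s ON s.BoxID = b.BoxID
        LEFT JOIN ReturnBox r ON r.BoxID = b.BoxID
        WHERE i.BoxItemID = ?
        """,
        (box_item_id,),
    ).fetchone()
    return row[0] if row else None
```

BoxItem keeps a single real foreign key to Box, so referential integrity is preserved without a type flag on the item.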
I'd go with just a single BoxItems table with IsDefective, ShippingBoxID, the shipping-box-related fields, ReturnBoxID and the return-box-related fields. Some fields will always be NULL for each record.
This is a very simple and self-evident design that the next developer is unlikely to be confused by. In theory this design is inefficient because of the guaranteed empty fields for each row. In practice, databases tend to have a minimum required storage size for each row anyway, so (unless the number of fields is huge) this design is as efficient as possible anyway, and much easier to code to.
I'd probably go with:
BoxTable:
box_id, box_descrip, box_status_id ...
1, Lovely Box, 1
2, Borked box, 2
3, Ugly Box, 3
4, Flammable Box, 4
BoxStatus:
box_status_id, box_status_name, box_type_id, ....
1,Shippable, 1
2,Return, 2
3,Ugly, 2
4,Dangerous,3
BoxType:
box_type_id, box_type_name, ...
1, Shipping box, ...
2, Return box, ....
3, Hazmat box, ...
That way the Box Status defines the box type, and it's flexible if you need to expand into a few more status levels or box types later on.
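For illustration, the sample rows above can be loaded into SQLite (through Python's sqlite3 module) and the box type resolved with two joins. The column types and foreign keys are assumptions; the data is taken from the listing above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BoxType (box_type_id INTEGER PRIMARY KEY, box_type_name TEXT);
CREATE TABLE BoxStatus (
    box_status_id   INTEGER PRIMARY KEY,
    box_status_name TEXT,
    box_type_id     INTEGER REFERENCES BoxType(box_type_id)
);
CREATE TABLE BoxTable (
    box_id        INTEGER PRIMARY KEY,
    box_descrip   TEXT,
    box_status_id INTEGER REFERENCES BoxStatus(box_status_id)
);
INSERT INTO BoxType VALUES (1, 'Shipping box'), (2, 'Return box'), (3, 'Hazmat box');
INSERT INTO BoxStatus VALUES (1, 'Shippable', 1), (2, 'Return', 2), (3, 'Ugly', 2), (4, 'Dangerous', 3);
INSERT INTO BoxTable VALUES (1, 'Lovely Box', 1), (2, 'Borked box', 2), (3, 'Ugly Box', 3), (4, 'Flammable Box', 4);
""")


def box_type_name(conn, box_id):
    """Follow box -> status -> type to find which kind of box this is."""
    row = conn.execute(
        "SELECT t.box_type_name FROM BoxTable b "
        "JOIN BoxStatus s ON s.box_status_id = b.box_status_id "
        "JOIN BoxType t ON t.box_type_id = s.box_type_id "
        "WHERE b.box_id = ?",
        (box_id,),
    ).fetchone()
    return row[0] if row else None
```

Adding a new status or box type is then a matter of inserting a row, with no schema change.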

Resources