Multivalued dependency example tricky - database

Employé3: {noEmp, ability, country}
I have this small set of attributes and the following constraint: each employee may have some abilities related to a certain country. For instance, Alfred can cook Italian and Chinese food and can write in French.
My problem is that I can't decide which MVD (multivalued dependency) would be the best solution. I tried
noEmp, country ->> ability, but it bugs me. It says that I can have two tuples with the same (noEmp, country) but not necessarily the same ability. OK, but is that enough?
I thought about using noEmp ->> country, ability, but it doesn't seem to express the relation between the ability and the country.
Of course, both of these MVDs are trivial, because each one covers all the attributes of the relation, so maybe it's a silly question...
One more question: what about the keys? Can I use the MVDs to determine them? At first I thought not, because a key must determine a single tuple. But in that case I would be forced to use all the attributes as the key, which is a little strange: how could I possibly have a 4NF relation if I can't use the MVDs to determine anything?

The pair (ability, country) is a compound attribute. Let's call it ethnic_ability for lack of a better term. Compound attributes are complex domains that are flattened into multiple columns of primitive datatypes. Examples: (yyyy_mm_dd_date, hh_mm_daytime), (first_name, last_name), (integer_part_of_real_number, decimal). From the perspective of multivalued dependencies, a compound attribute can be considered atomic. Therefore, you have a table with two columns {noEmp, ethnic_ability}, and there is not much that dependency theory can say about binary predicates.
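To make the candidate dependencies concrete, here is a minimal Python sketch (not part of the original posts) that tests whether a small relation satisfies X ->> Y. The function name satisfies_mvd, the noEmp value 1 for Alfred and the country values are assumptions made for illustration only.

```python
from itertools import product


def satisfies_mvd(rows, lhs, rhs):
    """Return True if the relation `rows` (a list of dicts) satisfies lhs ->> rhs."""
    if not rows:
        return True
    attrs = set(rows[0])
    rest = attrs - set(lhs) - set(rhs)  # attributes in neither side

    def pick(row, names):
        return tuple(row[n] for n in sorted(names))

    present = {(pick(r, lhs), pick(r, rhs), pick(r, rest)) for r in rows}

    # For each lhs value, collect the rhs values and the "rest" values seen with it.
    groups = {}
    for x, y, z in present:
        ys, zs = groups.setdefault(x, (set(), set()))
        ys.add(y)
        zs.add(z)

    # The MVD holds when every (rhs, rest) combination appears for each lhs value.
    return all(
        (x, y, z) in present
        for x, (ys, zs) in groups.items()
        for y, z in product(ys, zs)
    )


# Hypothetical data for Alfred (noEmp = 1), following the example in the question.
rows = [
    {"noEmp": 1, "ability": "cook", "country": "Italy"},
    {"noEmp": 1, "ability": "cook", "country": "China"},
    {"noEmp": 1, "ability": "write", "country": "France"},
]

trivial_a = satisfies_mvd(rows, ["noEmp", "country"], ["ability"])
trivial_b = satisfies_mvd(rows, ["noEmp"], ["ability", "country"])
independent = satisfies_mvd(rows, ["noEmp"], ["ability"])
```

With this data the two MVDs from the question hold (they are trivial, so they hold for any data), while noEmp ->> ability does not: Alfred cannot "write" for Italy. That is consistent with treating (ability, country) as a single compound value rather than two independent facts.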

Related

Does the min/max notation relationship match what I am trying to get?

Is this the right way to represent the relationship that is described in the text on the picture? It is in min/max notation.
http://s7.postimg.org/holux2uwb/image.jpg
There is a huge lack of context here, so I'll just take a blind shot at an answer.
When modeling data, an order is usually treated as an event. I do not know exactly what a "Bugel Card" is, but if it names a thing (a noun) and has properties/attributes that must be stored, as I suspect is also true of the Customer, then we have two entities in a relationship: the Customer entity and the Bugel Card entity. The resulting connection/relationship/link forms the Order event.
If, in an Order, a Customer ALWAYS uses exactly one "Bugel Card" (at least 1 and not more than 1), then the cardinality in min/max notation is (1,1) on both sides of the relationship between the Customer and Bugel Card entities. For (1,1) relationships, it is at the data modeler's discretion which side holds the relationship, that is, where the foreign key goes once you decompose the Conceptual Model. It is generally recommended to put the foreign key on the side where the relationship could become "many" in the future.
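As a rough sketch of what "where the foreign key goes" can look like after decomposition, here is one possible layout using Python's sqlite3 module. The table names customer and bugel_card, all column names and the sample row are assumptions, since the original diagram gives no attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
-- The FK sits on the card side. NOT NULL means every card has exactly one
-- customer; UNIQUE means a customer has at most one card.
CREATE TABLE bugel_card (
    card_id     INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL UNIQUE REFERENCES customer(customer_id)
);
""")

conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO bugel_card (card_id, customer_id) VALUES (100, 1)")

try:
    # A second card for the same customer violates the UNIQUE constraint.
    conn.execute("INSERT INTO bugel_card (card_id, customer_id) VALUES (101, 1)")
    second_card_accepted = True
except sqlite3.IntegrityError:
    second_card_accepted = False
```

Note that this schema does not enforce the minimum of 1 on the customer side (a customer row can exist without a card). If the relationship later becomes "many" cards per customer, dropping UNIQUE is the only change needed, which is the reason for putting the key on that side.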
If you can add more context, I can give you a more accurate answer. And remember:
Do not model data without full context. When you move from the Conceptual Model to an Entity Relationship Diagram, you need a context, and one that is very well described. Without full context there is no diagram, and as a result there is no database schema (much less a system to use and manage).
Beyond that, it is not possible to model entities without properties/attributes. Without them an entity is nothing, because when it is decomposed there will be no columns to create, and so no data to persist. Even if you leave defining the attributes for later in your modeling process, you can end up confusing yourself and/or forgetting something. This is error-prone.
To be honest, there is no standard way of modeling data. What I have said so far are just data modeling tips. What you do, and how you do it, is up to you.
If you have any questions or need anything else, please comment and I will help.

Tips while 'normalizing' databases

I would be grateful if someone could explain how I should look for normalization errors in databases AND in entity classes in any language.
I just would like to know what is the most important and where should I look for possible errors in classes - in DAOs, BEANS or wherever. What should I take into account - any conventions, schemes etc?
For any answer, thanks in advance! :)
I guess you've read something about the normal forms, e.g. on Wikipedia. So I guess you know something, but you are not sure why you should do it or what is really important.
For example, if you have a table that contains relations between persons, it should contain only IDs, not names. If you have, e.g., a table of patients with columns father_name and mother_name, that is an example of a non-normalized table that causes trouble.
Let's say the mother changes her name: from this moment on, your database is in an inconsistent state. You decide to add some cascade/trigger on this change and you run into an even worse problem: you realize that several people can have the same name.
That is basically the main reason for using IDs as keys rather than some column that is not a unique identifier. There is much more to learn; I hope someone provides you with a link to a tutorial, as this is not really Q/A stuff.
Another good reason for normalizing is sparse tables: tables where some columns rarely contain anything other than null. E.g., there are four types of some device, and each has different properties that are left null for the other types. In this case, creating a table that holds the specific properties of each device type (even though it's just a {0,1}:1 relation) is advisable.
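To illustrate the father_name/mother_name point, here is a small sketch with Python's sqlite3 module in which the patient table stores only IDs. The table names person and patient, the column names and the sample people are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE patient (
    person_id INTEGER PRIMARY KEY REFERENCES person(id),
    mother_id INTEGER REFERENCES person(id),
    father_id INTEGER REFERENCES person(id)
);
""")

conn.executemany(
    "INSERT INTO person (id, name) VALUES (?, ?)",
    [(1, "Anna Smith"), (2, "John Smith"), (3, "Tom Smith"), (4, "Anna Smith")],
)
conn.execute("INSERT INTO patient (person_id, mother_id, father_id) VALUES (3, 1, 2)")

# The mother changes her name: exactly one row changes, addressed by ID.
conn.execute("UPDATE person SET name = ? WHERE id = ?", ("Anna Jones", 1))

mother_name = conn.execute("""
    SELECT m.name
    FROM patient p JOIN person m ON m.id = p.mother_id
    WHERE p.person_id = 3
""").fetchone()[0]

# The unrelated person who happened to share the old name is untouched.
namesake = conn.execute("SELECT name FROM person WHERE id = 4").fetchone()[0]
```

Because the patient row refers to the mother by ID, the rename is visible through the join without touching the patient table, and the namesake with id 4 cannot be confused with her.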

Generic relation for database

I have to design a generic entity that can refer to various other entities.
In my example, that would be a commentary entity inside a web application. You could post commentaries on users, classifieds, articles, varieties (botanical ones), and so on.
So that entity would be made like this:
As a matter of fact, the design (kind of) pattern would be this one:
What are the pros and cons of this kind of pattern?
What I see is:
Pros
It decreases the number of entities if the concept is the same (commentaries for example);
You can therefore easily manipulate heterogeneous objects;
You can aggregate these objects easily (e.g. this user's last commentaries in the whole site, presented easily in a same thread);
Cons
It is easy to abuse (overuse it and both your database and your source code become ugly);
The database cannot check the references, so that check must be done in the application code.
What is the performance impact?
Conclusion
Is this kind of pattern suitable for a relational database? If not, what should we do instead?
Thanks in advance.
One more con :
This scheme relies on a mapping between values and names for the "entities" referred to by those values. Think of all the fun you'll have resolving issues where, in the TEST system, the ORDER entity has number 734 but in production it has number 256. You can use the entity names themselves as the values of your entity_id stuff, but you will never be able to avoid hardcoding values for them in your programs (or, say, in view definitions) anyway, thereby defeating whatever advantage you thought you could win.
This kind of scheme is a disease mostly suffered by OO programmers. They see structures that are largely similar and have the instinctive reflex "I must find a way to reuse the existing thing for this", forgetting that database design is not program design.
EDIT
(if it wasn't clear, this means my answer to your question "Is this kind of pattern suitable for a relational database?" is a principled "NO".)
This is the classic Polymorphic Association anti-pattern. There are a number of possible solutions:
1) Exclusive Arcs, e.g. for the Commentary entity:
    Id
    User_Id
    Classified_Id
    Article_Id
    Variety_Id
Where User_Id, Classified_Id, Article_Id and Variety_Id are nullable and exactly one must be not null.
2) Reverse the Relationship, e.g. remove the Target_Entity and Target_Entity_Id from the Commentary entity and create four new entities:
User_Commentary
    Commentary_Id
    User_Id
Classified_Commentary
    Commentary_Id
    Classified_Id
Article_Commentary
    Commentary_Id
    Article_Id
Variety_Commentary
    Commentary_Id
    Variety_Id
Where Commentary_Id is unique and relates to the Id in Commentary.
3) Create a super-type entity for User, Classified, Article and Variety and have the Commentary entity reference the unique attribute of this new entity.
You would need to decide which of these approaches you feel is most appropriate in your specific situation.
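As a rough illustration of option 1 (exclusive arcs), here is a minimal sketch using Python's sqlite3 module. The Body column, the sample rows and the helper try_insert are assumptions added for the example. The point is that, unlike the generic Target_Entity/Target_Entity_Id pair, both the "exactly one target" rule and the references themselves are enforced by the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE User       (Id INTEGER PRIMARY KEY);
CREATE TABLE Classified (Id INTEGER PRIMARY KEY);
CREATE TABLE Article    (Id INTEGER PRIMARY KEY);
CREATE TABLE Variety    (Id INTEGER PRIMARY KEY);

CREATE TABLE Commentary (
    Id            INTEGER PRIMARY KEY,
    Body          TEXT NOT NULL,  -- hypothetical payload column
    User_Id       INTEGER REFERENCES User(Id),
    Classified_Id INTEGER REFERENCES Classified(Id),
    Article_Id    INTEGER REFERENCES Article(Id),
    Variety_Id    INTEGER REFERENCES Variety(Id),
    -- Each comparison yields 0 or 1, so the sum counts the non-null targets.
    CHECK (
        (User_Id IS NOT NULL) + (Classified_Id IS NOT NULL) +
        (Article_Id IS NOT NULL) + (Variety_Id IS NOT NULL) = 1
    )
);
""")

conn.execute("INSERT INTO User (Id) VALUES (1)")
conn.execute("INSERT INTO Article (Id) VALUES (10)")


def try_insert(sql, params):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


one_target = try_insert(
    "INSERT INTO Commentary (Body, User_Id) VALUES (?, ?)", ("nice profile", 1))
two_targets = try_insert(
    "INSERT INTO Commentary (Body, User_Id, Article_Id) VALUES (?, ?, ?)",
    ("ambiguous", 1, 10))
no_target = try_insert(
    "INSERT INTO Commentary (Body) VALUES (?)", ("orphan",))
dangling = try_insert(
    "INSERT INTO Commentary (Body, Article_Id) VALUES (?, ?)", ("ghost", 999))
```

Only the first insert is accepted. The cost of this approach is one more nullable column (and a longer CHECK) for every new kind of target, which is part of the trade-off between the three options.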

Naming database table fields to designate relationships?

Let's say I have tables Student and Mentor.
Does anyone use a naming convention for relationships like the one below? I think it makes the relationships easy to see quickly. Would anyone suggest a better way?
Student
StudentID
StudentName
Student2MentorID
To start from scratch (you probably know this already): there are several ways to represent your database schema with diagrams, for example ER diagrams, which help you (and your team) stay up to date with your database's design and make it simpler to understand.
Now, personally, when it comes to implementation, I do use a naming convention of sorts. For example:
For large projects, I use double underscores to separate table categories (e.g. hr__personnel, hr__clocks, hr__timetable, vehicles__cars, vehicles__trips), and so on.
For a relationship between two tables, I include both (or all) of the involved table names (e.g. hr__personnel_timetable, vehicles__cars_trips, etc.).
Sometimes (as we all know) we cannot strictly follow a standard, so in those cases I use my own judgment when naming large relationships.
As a rule, I also name table attributes with a three-letter prefix. For example, in my table trips, the fields will be tri_id, tri_distance, tri_elapsed.
Note that the item above doesn't include a foreign key, so here it is. When it comes to FKs, this convention makes it easy for me (and my team) to see that a field IS an FK.
Following the previous example, say I would like to know who drives each trip (to keep it simple, we assume that only one person drives a trip). My table now looks something like this: tri_id, per_id, tri_distance, tri_elapsed. You can easily see that per_id is a foreign field in this table, because it carries another table's prefix. Just another hint to help.
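Here is a small sketch of that convention using Python's sqlite3 module. The per_name column and the sample rows are assumptions; the table and field names follow the examples given above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hr__personnel (
    per_id   INTEGER PRIMARY KEY,
    per_name TEXT NOT NULL  -- hypothetical column
);
CREATE TABLE vehicles__trips (
    tri_id       INTEGER PRIMARY KEY,
    per_id       INTEGER NOT NULL REFERENCES hr__personnel(per_id),
    tri_distance REAL,
    tri_elapsed  REAL
);
""")

conn.execute("INSERT INTO hr__personnel (per_id, per_name) VALUES (1, 'Dana')")
conn.execute(
    "INSERT INTO vehicles__trips (tri_id, per_id, tri_distance, tri_elapsed) "
    "VALUES (1, 1, 12.5, 0.4)"
)

# The FK keeps the name of the column it references, so USING (per_id) works
# and no column in the result needs a table qualifier.
rows = conn.execute("""
    SELECT tri_id, per_name, tri_distance
    FROM vehicles__trips JOIN hr__personnel USING (per_id)
""").fetchall()
```

A side effect of the prefixes is that column names are unique across tables, so queries like the one above rarely need aliases.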
Just by following these simple steps, you will save hours, and probably some headaches too.
Hope this helps.
I think you can add a prefix (3 letters) to the table name depending on the module it represents (scholar, sales, store):
module: scholar -> sc
table: scStudent (IdStudent, nameStudent, ...)
table: scMentor (IdMentor, nameMentor, ...)
relationship:
scMentorStudent (IdMentorStudent pk, ...)
You can use Microsoft's EF notation:
http://weblogs.asp.net/jamauss/pages/DatabaseNamingConventions.aspx
It is better to use underscores...
I suggest simply using an existing naming convention such as this one:
http://www.oracle-base.com/articles/misc/naming-conventions.php

Table and column naming conventions when plural and singular forms are odd or the same

In my search I found mostly arguments for whether to use plurality in database naming conventions, and ways to handle it in either case. I have decided I prefer plural table names, so I don't want to argue that.
I need to represent an animal's species and genus and so on in a database. The plural and singular form for 'species' are the same, and the plural of 'genus' is 'genera'.
I'm using Microsoft's Entity Data Model, by the way. My concern is mainly about whether I'll have problems later on depending on my naming choices.
I think I can get by with:
Table: Genera | Column: Genus
But I'm unsure how I should handle:
Table: Species | Column: Species
If I really wanted to be lazy about this I'd just name them 'species > specie' and 'genuses > genus', but I would prefer to read them in their correct forms.
Any advice would be appreciated.
I would go for Genera/Genus and Species/Species. That's how you say it in English, so why use an incorrect form of the word?
I generally avoid having a column name that is the same as a table name because it can be confusing to human readers. The database engine knows whether it expects a table name or a column name in any given context, and I don't recall that ever being a problem. (Is there some context where either would be valid? I can't think of one.)
That said, if you run into this issue, it indicates to me that you have a poorly chosen name for one or the other. Species makes good sense as a table name: this table contains data about a species. So if a field in that table is called "species" ... what about the species? Presumably everything in the table is about a species. I'd guess it was probably some sort of identifier and not, say, the number of chromosomes or method of reproduction. But is it an ID number? An abbreviation? The common name? The binomial nomenclature name? Etc. If it's, say, the common name, I'd call it "common_name" and not "species".
By the way, another naming convention you should decide on is whether column names that could be ambiguous if taken out of context should have names that specify the context, or whether you use the table name to eliminate the ambiguity. For example, you could have many things that have a "name". You could call any such field simply "name", and if there's ambiguity, qualify it, like "species.name", "laboratory.name", etc. Or you could give each field a unique name, like "species_name", "laboratory_name", etc. That's one of those questions that I think has no definitively right answer, just pros and cons and make a decision and be consistent.
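For what it's worth, here is a small sketch of the "plain name, qualified by table" style using Python's sqlite3 module. The lowercase table names, the columns and the sample row are assumptions chosen for the example, not a recommendation for the asker's actual model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genera (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE species (
    id          INTEGER PRIMARY KEY,
    genus_id    INTEGER NOT NULL REFERENCES genera(id),
    name        TEXT NOT NULL,  -- the specific epithet, not "species"
    common_name TEXT
);
""")

conn.execute("INSERT INTO genera (id, name) VALUES (1, 'Panthera')")
conn.execute(
    "INSERT INTO species (id, genus_id, name, common_name) "
    "VALUES (1, 1, 'leo', 'lion')"
)

# Both tables have a "name" column, so each use is qualified and aliased.
row = conn.execute("""
    SELECT genera.name AS genus_name,
           species.name AS species_name,
           species.common_name
    FROM species JOIN genera ON genera.id = species.genus_id
""").fetchone()
```

With the other style (genus_name, species_name as the actual column names) the aliases would be unnecessary, at the price of longer names everywhere else. Either way, no column ends up being called "species" inside the species table.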

Resources