Dividing relations to achieve BCNF - database

Is the relation below correctly decomposed into relations in BCNF?
R(a,b,c,d,e) - a and b are primary keys and there are dependencies such as:
a → c
a → e
c → e
I split the above relation into:
AC(a,c)
CE(c,e)
AB(a,b,d)

Is it the case that a is a primary key and b is a primary key, or is it the case that {a,b} is the (composite) primary key? If the columns are separately primary keys, then you have a number of additional but not explicitly stated functional dependencies: a → bd and b → acde. If the columns {a,b} are a composite PK, then you have an additional functional dependency ab → cde. Either way, the AC and CE relations are fine, and the ABD relation is the other necessary one. The only issue is 'what are the candidate keys of ABD'? And the answer is 'either {a,b} as a composite PK, or a and b as two separate candidate keys'.

Are you sure about that primary key? Normally, determining all the candidate keys is part of these kinds of exercises.
An informal way of expressing what we know about candidate keys is that every attribute that's not on the right-hand side (RHS) of any functional dependency must be part of every candidate key.
Since I don't know how you determined that {ab} is a candidate key, I'd be inclined to say that, because none of a, b, and d appears on any RHS, all three must be part of every candidate key.
In short, your FDs say that {abd} is the primary key, not {ab}.
In order for your key and your decomposition to be right, you need to have the additional FD ab->d.
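The candidate keys can be checked mechanically with an attribute-closure computation. The following is a minimal sketch, not part of the original answers; the helper names closure and candidate_keys are made up for illustration, and the brute-force search is only practical for small schemas like this one.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is already covered.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets whose closure is everything."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            # Skip supersets of keys already found: they are not minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

R = "abcde"
stated = [("a", "c"), ("a", "e"), ("c", "e")]

keys_stated = candidate_keys(R, stated)                     # only the FDs in the question
keys_with_extra = candidate_keys(R, stated + [("ab", "d")])  # plus the extra FD ab -> d
print(keys_stated)
print(keys_with_extra)
```

With only the stated FDs, the search reports {a, b, d} as the single candidate key. Adding ab -> d makes {a, b} the single candidate key, which is the situation the decomposition in the question assumes.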

Related

Is it in 2NF or not?

R1(A, B, C, D) is a relation. It is specified that every attribute has only atomic values, and there is a set of dependencies (BD->C, C->A). It's clear to me that the relation is in the 1NF, but is it in the 2NF? I mean, BD is obviously a primary key, and we can conclude that BD->A, so all the attributes depend on the key. It isn't in the 3NF, for sure, because the 3NF doesn't accept transitive dependencies, but this shouldn't be a problem for the 2NF. I'm having doubts because some people told me that this couldn't be in 2NF. Is my reasoning correct? Is it in 2NF or not?
A relation schema is in 2NF if every non-prime attribute (i.e. one that does not belong to any candidate key) is fully functionally dependent on each candidate key.
This definition implies that, if a dependency X → A can be derived in which A is not a prime attribute and X is a proper subset of a candidate key, then such dependency violates the 2NF.
Since the (only) candidate key of this relation is BD, the attributes A and C are non-primes.
Since BD → C is given and BD → A can be derived, while none of B → C, D → C, B → A, or D → A can be derived, the relation is in 2NF.
Note that the 2NF has only historical interest, and the normalization process is discussed in many books (and formal algorithms are presented) only for BCNF, 3NF and higher normal forms.
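The 2NF argument above can be reproduced with a small closure computation. This is only a sketch under the assumptions stated in the question (key BD, non-prime attributes A and C); the function names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(key, non_prime, fds):
    """List (subset, attribute) pairs where a proper subset of the key
    determines a non-prime attribute, i.e. the 2NF violations."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            for attr in sorted(closure(subset, fds) & set(non_prime)):
                found.append(("".join(subset), attr))
    return found

fds = [("BD", "C"), ("C", "A")]

key_closure = closure("BD", fds)                    # BD determines every attribute
violations = partial_dependencies("BD", "AC", fds)  # empty list: no partial dependency
print(key_closure, violations)
```

An empty list of violations means neither B nor D alone determines A or C, which is the condition for 2NF used in the answer.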

BCNF and 3NF property

I read a statement that "If a relation is in 3NF and does not contain any overlapping candidate keys, then it is definitely in BCNF."
Suppose we consider a relation R(A,B,C,D) with the following functional dependencies:
AB --> CD
C --> A
Here the only candidate key is AB, and the resulting relation is in 3NF but not in BCNF, because C is not a super key.
So the above statement doesn't hold true.
Where am I going wrong?
Your relation has overlapping candidate keys. While BC doesn't appear on the left-hand side of any of the given functional dependencies, we can derive the fact that BC is a candidate key.
Starting with C -> A, we can use Armstrong's Axiom of Augmentation to determine that CB -> AB, and since it's known that AB is a candidate key, that means all other attributes are determined.
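The same derivation can be confirmed by enumerating candidate keys. This is an illustrative sketch only; closure and candidate_keys are hypothetical helper names, and the exhaustive search is reasonable only for a handful of attributes.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure covers the relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # a superset of a known key is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

fds = [("AB", "CD"), ("C", "A")]
keys = candidate_keys("ABCD", fds)
overlap = set.intersection(*keys)  # attributes shared by all candidate keys
print(keys, overlap)
```

The search finds two candidate keys, {A, B} and {B, C}, which share the attribute B. That overlap is why the quoted statement does not apply to this relation.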

Does this qualify as a partial dependency?

If a proper subset of a candidate key, together with a non-prime attribute, determines a non-prime attribute, is that a partial dependency, or is the relation still in 2NF?
For example, the candidate key is {AB} and BD --> C.
This is how you define a partial dependency. For example, you have a relation R with columns A|B|C|D.
Based on the functional dependencies defined by the business, AB is designated as the primary key (a candidate key), and the FD B -> D exists.
In that case, even though AB is the PK, the non-key column D is uniquely identified by B alone (part of the key) and not by the entire key AB.
So D is partially dependent on the key, and this partial functional dependency violates 2NF.
In your case, the determinant BD is not a subset of the candidate key AB, so BD -> C is not a partial dependency in this sense: from BD -> C alone you cannot derive B -> C or A -> C, so C still depends on the whole key.
What BD -> C does violate is 3NF, because BD is not a super key and C is non-prime.

Normalization 3NF and BCNF

If I have the following relation R = (A, B, C, D)
And the functional dependencies:
A -> B, B -> A, CDB -> A, CDA -> B
The candidate keys are CDA and CDB.
The third normal form says that there cannot be a functional dependency between non-prime attributes. A non-prime attribute is an attribute that doesn't occur in any of the candidate keys. That means that this relation is already in 3NF, since both A and B, which depend on each other, are part of one of the candidate keys. Am I right?
If so, I have another question about BCNF. BCNF says that every determinant must be a candidate key. In this case, A and B are not candidate keys, so that violates BCNF, or am I missing something here?
Thanks.
If the four FDs you have given are supposed to be a canonical cover of the FDs satisfied by R, then you are right to conclude that CDA and CDB must be candidate keys. (You didn't state that the FDs are canonical, and if they are not, then there are other ways to satisfy the same dependencies, but I guess the intent of the question is that the candidate keys must be inferred only from what you are given.)
If CDA and CDB are in fact the candidate keys of R then you are right that R satisfies 3NF but not BCNF.
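As a rough check of that conclusion, each stated FD can be tested against the BCNF and 3NF conditions. This sketch takes the candidate keys from the question as given and inspects only the stated FDs; the function names are made up for illustration.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

attributes = set("ABCD")
fds = [("A", "B"), ("B", "A"), ("CDB", "A"), ("CDA", "B")]
keys = [set("ACD"), set("BCD")]   # candidate keys as stated in the question
prime = set().union(*keys)        # attributes that occur in some candidate key

def violates_bcnf(lhs, rhs):
    """A non-trivial FD violates BCNF when its left side is not a superkey."""
    nontrivial = not set(rhs) <= set(lhs)
    return nontrivial and closure(lhs, fds) != attributes

def violates_3nf(lhs, rhs):
    """3NF additionally tolerates the FD when every new right-side attribute is prime."""
    return violates_bcnf(lhs, rhs) and not (set(rhs) - set(lhs)) <= prime

bcnf_violations = [fd for fd in fds if violates_bcnf(*fd)]
third_nf_violations = [fd for fd in fds if violates_3nf(*fd)]
print(bcnf_violations, third_nf_violations)
```

Only A -> B and B -> A show up as BCNF violations, and nothing shows up as a 3NF violation, because every attribute of R is prime.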

Superkey vs Primary Key

I'm having issues understanding super keys in a relation when the relation only contains one functional dependency.
When considering a relation R(A,B,C,D,E) where A is the primary key and with the functional dependency A->B, can A be considered a super key to the relation since there is only one FD? Or would one need to expand the functional dependency to include the unmentioned parts of the relation (C,D,E) in order to find a super key?
I'm mainly confused because all the material I've seen on the web until this point has contained multiple functional dependencies all of which have contained all of the attributes within the relation, so I'm not sure how to interpret the unused attributes. If someone could help clarify this I'd appreciate it!
If the only explicitly mentioned FD for the relation schema is A ⟶ B, then there is the implicit, trivial FD {A,B,C,D,E} ⟶ {A,B,C,D,E}. Given that A ⟶ B, we can deduce that:
{A,C,D,E} is the primary key of R.
There is a partial key dependency for A ⟶ B so R is not in BCNF and the relation schema R should be broken down into two non-loss projections:
R1(A,B) with A ⟶ B so A is the primary key.
R2(A,C,D,E) which is all key (primary key is the combination {A,C,D,E}).
And R1 ⋈ R2 ≡ R (assuming ⋈ is the join operation).
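To see the non-loss property on concrete data, here is a small sketch. The sample rows are hypothetical and chosen only so that A ⟶ B holds; the projections and the natural join on A are written out by hand with set comprehensions.

```python
# Hypothetical sample rows for R(A, B, C, D, E); every A value maps to one B value.
R = {
    (1, "x", 10, 100, "p"),
    (1, "x", 11, 100, "q"),
    (2, "y", 10, 101, "p"),
    (3, "x", 12, 102, "r"),
}

# Projections onto the two decomposed schemas.
R1 = {(a, b) for a, b, c, d, e in R}          # R1(A, B)
R2 = {(a, c, d, e) for a, b, c, d, e in R}    # R2(A, C, D, E)

# Natural join of R1 and R2 on the shared attribute A.
joined = {
    (a, b, c, d, e)
    for a, b in R1
    for a2, c, d, e in R2
    if a == a2
}

print(joined == R)
```

Because A determines B, each row of R2 matches exactly one row of R1, so the join reproduces R without adding spurious rows.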
As Catcall says in comments, there is mention of A being the primary key of R. If A is a primary key, then each of the singleton FDs A ⟶ B, A ⟶ C, A ⟶ D and A ⟶ E (or, collectively, A ⟶ {B,C,D,E}) applies, and there's no need to call out A ⟶ B separately. If A is a primary key of R, there's no need to decompose R at all; there is really nothing interesting about the table because it is normalized to 5NF given the available information (assuming there are no unstated non-trivial FDs, MVDs or JDs).
If A is a primary key, then it is also a superkey, but not a proper superkey. Any of the attribute combinations with A and one or more of the other attributes is also a superkey, and is a proper superkey.
"When considering a relation R(A,B,C,D,E) where A is the primary key and with the functional dependency A->B, can A be considered a super key to the relation since there is only one FD?"
If it's given that A is the primary key of R, then by definition you have the FD A->BCDE. It also follows that
A->B
A->C
A->D
A->E
If A is a primary key of R, then it follows by definition that A is also a candidate key and a superkey of R.
Let's make it simple.
Here are definitions of super, candidate, and primary keys:
Super Keys
Super key stands for superset of a key.
A Super Key is a set of one or more attributes that are taken collectively and can identify all other attributes uniquely.
Candidate Keys
Candidate Keys are super keys for which no proper subset is a super key.
In other words candidate keys are minimal super keys.
Primary Key
It is a candidate key that is chosen by the database designer to identify entities within an entity set.
A primary key is therefore a minimal super key. In an ER diagram, the primary key is represented by underlining the primary key attribute.
Ideally a primary key is composed of only a single attribute. But it is possible to have a primary key composed of more than one attribute.
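The difference between super keys and candidate keys can be illustrated by enumeration. This sketch assumes, as in the question, that A is the primary key of R(A,B,C,D,E), so that A -> BCDE holds; the helper name closure is hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

attributes = "ABCDE"
fds = [("A", "BCDE")]   # assumption: A is the primary key, so A -> BCDE

# A super key is any attribute set whose closure is the whole relation.
superkeys = [
    set(combo)
    for size in range(1, len(attributes) + 1)
    for combo in combinations(attributes, size)
    if closure(combo, fds) == set(attributes)
]

# A candidate key is a super key with no proper subset that is also a super key.
candidates = [k for k in superkeys if not any(other < k for other in superkeys)]

print(len(superkeys), candidates)
```

Every attribute set that contains A is a super key (16 of them here), but only {A} itself is minimal, so it is the only candidate key and hence the only possible primary key.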
