How to find multivalued dependencies that not satisfy R? - database

I'm currently taking the course on DB and the theme is Relational Design Theory. Sub-theme is Multivalued dependencies and I'm completely lost in them :(
I have such a question:
R(A,B,C):
A | B | C
----------
1 | 2 | 3
1 | 3 | 2
1 | 2 | 2
3 | 2 | 1
3 | 2 | 3
Which of the following multivalued dependencies does this instance of R not satisfy?
1. AB ↠ C
2. B ↠ C
3. C ↠ A
4. BC ↠ C
I know that:
A | B | rest
----------
a | b1 | r1
a | b2 | r2
a | b1 | r2
a | b2 | r1
I'm making tables for those dependencies in those manner:
B ↠ C
B | C | A
----------
2 | 3 | 1
2 | 2 | 1
2 | 1 | 3
2 | 3 | 3
3 | 2 | 1
According to my assumptions:
2 2 1 and 2 1 3 are at rule above, so there also should be 2 1 1 and 2 2 3 records.
So this one is not correct. Am I right?
And I don't have any idea how to build such table for AB ↠ C. It will be the same table?
A | B | C
----------
1 | 2 | 3
1 | 3 | 2
1 | 2 | 2
3 | 2 | 1
3 | 2 | 3
Is it something that called trivial dependency?

Let me stretch my mind back to college days, from my Database Modeling Concepts class... Taking a quick refresher course via some online PDFs...
I wrote and rewrote a paragraph here to try to concisely explain the multidetermines operator, but I can't explain it any more clearly than the PDF above. Roughly speaking, if B ↠ C, then for every (B, C, A) tuple that exists, you can swap out any C value in (B, C) in R and still find a tuple in R.
In the case of your last question, you're right. I don't remember the proper term, but since there are no other columns than A, B, C, then AB ↠ C is trivially satisfied in R. To build a table for it, I'd order by A first, then B. It would look something like
A | B | C
---------
1 | 2 | 2
1 | 2 | 3
1 | 3 | 2
3 | 2 | 1
3 | 2 | 3
B ↠ C is a much more interesting example. Since the (B, C, A) tuples (2, 3, 1) and (2, 1, 3) both exist, the tuples (2, 1, 1) and (2, 3, 3) (obtained by swapping C) must both exist to satisfy R. Since (2, 1, 1) does not exist, it does not satisfy R. Writing out the tables like you did makes it quite easy to see.
If you see the examples in the PDF I linked earlier, it puts meaningful names to the columns, which may aid in your understanding. I hope this sets you on the right path!

Related

What is the maximum number of tuples that can be returned by natural join?

Consider that the relation R(A,B,C) contains 200 tuples and relation S(A,D,E) contains 100 tuples, then the maximum number of tuples possible in a natural join of R and S.
Select one:
A. 300
B. 200
C. 100
D. 20000
It will be great if the answer is provided with some explanation.
The maximum number of tuples possible in natural join will be 20000.
You can find what natural join exactly is in this site.
Let us check for the given example:
Let the table R(A,B,C) be in the given format:
A | B | C
---------------
1 | 2 | 4
1 | 6 | 8
1 | 5 | 7
and the table S(A,D,E) be in the given format:
A | D | E
---------------
1 | 2 | 4
1 | 6 | 8
Here, the result of natural join will be:
A | B | C | D | E
--------------------------
1 | 2 | 4 | 2 | 4
1 | 2 | 4 | 6 | 8
1 | 6 | 8 | 2 | 4
1 | 6 | 8 | 6 | 8
1 | 5 | 7 | 2 | 4
1 | 5 | 7 | 6 | 8
Thus we can see the resulting table has 3*2=6 rows. This is the maximum possible value because both the input tables have the same single value in column A (1).
Natural join returns all tuple values that can be formed from (tuple-joining or tuple-unioning) a tuple value from one input relation and a tuple value from the other. Since they could agree on a single subtuple value for the common set of attributes, and there could be unique values for the non-common subtuples within each relation, you could get a unique result tuple from every pairing, although no more than that. So the maximum number of tuples is the product of the tuple counts of the relations.
Here that's D 20000.
A and A present in R and S so according to natural join 100 tuples take part in join process.
Option C 100 is the answer.

SQL query/functions to flatten a multiple table item and hierarchical item links

I have the data structure below, storing items and links between them in parent-child relashionship.
I need to display the result as show below, one line by parent, with all children.
Values are the ItemCodes by item type, for ex. C-1 and C-2 are the 2 first items of type C, and so on.
In a previous application version, there were only one C and one H maximum for each P.
So I did a max() and group by mix and the result was there.
But now, parents may be linked to different types and number of children.
I tried several techniques including adding temporary tables, views, use of PIVOT, ROLLUP, CUBE, stored procedures and cursors (!), but nothing worked for this specific problem.
I finally succeeded to adapt the query. However, there are many select from (select ...) clauses, as well as row_number based queries.
Also, the result is not dynamic, meaning the number of columns is fixed (which is acceptable).
My question is: what would be your approach for such issue (if possible in a single query)? Thank you!
The table structure:
Item
-------------------------------
ItemId | ItemCode | ItemType
-------------------------------
1 | P1 | P
2 | C11 | C
3 | H11 | H
4 | H12 | H
5 | P2 | P
6 | C21 | C
7 | C22 | C
8 | C23 | C
9 | H21 | H
ItemLink
---------------------------------------
LinkId | ParentItemId | ChildItemId
---------------------------------------
1 | 1 | 2
2 | 1 | 3
3 | 1 | 4
4 | 2 | 6
5 | 2 | 7
6 | 2 | 8
7 | 2 | 9
Expcted Result
-----------------------------------------------------
P C-1 C-2 ... C-N H1 H2 ... H-N
-----------------------------------------------------
P1 C11 NULL NULL NULL H11 H12 NULL NULL
P2 C21 C22 C23 NULL H21 NULL NULL NULL
...
Part of my current query (which is working):
!http://s12.postimg.org/r64tgjjnh/SOQuestion.png

Why do these join differently based on size?

In Postgresql, if you unnest two arrays of the same size, they line up each value from one array with one from the other, but if the two arrays are not the same size, it joins each value from one with every value from the other.
select unnest(ARRAY[1, 2, 3, 4, 5]::bigint[]) as id,
unnest(ARRAY['a', 'b', 'c', 'd', 'e']) as value
Will return
1 | "a"
2 | "b"
3 | "c"
4 | "d"
5 | "e"
But
select unnest(ARRAY[1, 2, 3, 4, 5]::bigint[]) as id, -- 5 elements
unnest(ARRAY['a', 'b', 'c', 'd']) as value -- 4 elements
order by id
Will return
1 | "a"
1 | "b"
1 | "c"
1 | "d"
2 | "b"
2 | "a"
2 | "c"
2 | "d"
3 | "b"
3 | "d"
3 | "a"
3 | "c"
4 | "d"
4 | "a"
4 | "c"
4 | "b"
5 | "d"
5 | "c"
5 | "b"
5 | "a"
Why is this? I assume some sort of implicit rule is being used, and I'd like to know if I can do it explicitly (eg if I want the second style when I have matching array sizes, or if I want missing values in one array to be treated as NULL).
Support for set-returning functions in SELECT is a PostgreSQL extension, and an IMO very weird one. It's broadly considered deprecated and best avoided where possible.
Avoid using SRF-in-SELECT where possible
Now that LATERAL is supported in 9.3, one of the two main uses is gone. It used to be necessary to use a set-returning function in SELECT if you wanted to use the output of one SRF as the input to another; that is no longer needed with LATERAL.
The other use will be replaced in 9.4, when WITH ORDINALITY is added, allowing you to preserve the output ordering of a set-returning function. That's currently the main remaining use: to do things like zip the output of two SRFs into a rowset of matched value pairs. WITH ORDINALITY is most anticipated for unnest, but works with any other SRF.
Why the weird output?
The logic that PostgreSQL is using here (for whatever IMO insane reason it was originally introduced in ancient history) is: whenever either function produces output, emit a row. If only one function has produced output, scan the other one's output again to get the rows required. If neither produces output, stop emitting rows.
It's easier to see with generate_series.
regress=> SELECT generate_series(1,2), generate_series(1,2);
generate_series | generate_series
-----------------+-----------------
1 | 1
2 | 2
(2 rows)
regress=> SELECT generate_series(1,2), generate_series(1,3);
generate_series | generate_series
-----------------+-----------------
1 | 1
2 | 2
1 | 3
2 | 1
1 | 2
2 | 3
(6 rows)
regress=> SELECT generate_series(1,2), generate_series(1,4);
generate_series | generate_series
-----------------+-----------------
1 | 1
2 | 2
1 | 3
2 | 4
(4 rows)
In the majority of cases what you really want is a simple cross join of the two, which is a lot saner.
regress=> SELECT a, b FROM generate_series(1,2) a, generate_series(1,2) b;
a | b
---+---
1 | 1
1 | 2
2 | 1
2 | 2
(4 rows)
regress=> SELECT a, b FROM generate_series(1,2) a, generate_series(1,3) b;
a | b
---+---
1 | 1
1 | 2
1 | 3
2 | 1
2 | 2
2 | 3
(6 rows)
regress=> SELECT a, b FROM generate_series(1,2) a, generate_series(1,4) b;
a | b
---+---
1 | 1
1 | 2
1 | 3
1 | 4
2 | 1
2 | 2
2 | 3
2 | 4
(8 rows)
The main exception is currently for when you want to run multiple functions in lock-step, pairwise (like a zip), which you cannot currently do with joins.
WITH ORDINALITY
This will be improved in 9.4 with WITH ORDINALITY, a d while it'll be a bit less efficient than a multiple SRF scan in SELECT (unless optimizer improvements are added) it'll be a lot saner.
Say you wanted to pair up 1..3 and 10..40 with nulls for excess elements. Using with ordinality that'd be (PostgreSQL 9.4 only):
regress=# SELECT aval, bval
FROM generate_series(1,3) WITH ORDINALITY a(aval,apos)
RIGHT OUTER JOIN generate_series(1,4) WITH ORDINALITY b(bval, bpos)
ON (apos=bpos);
aval | bval
------+------
1 | 1
2 | 2
3 | 3
| 4
(4 rows)
wheras the srf-in-from would instead return:
regress=# SELECT generate_series(1,3) aval, generate_series(1,4) bval;
aval | bval
------+------
1 | 1
2 | 2
3 | 3
1 | 4
2 | 1
3 | 2
1 | 3
2 | 4
3 | 1
1 | 2
2 | 3
3 | 4
(12 rows)

Trouble understanding meaning of functional dependency notation (A - > BC)

I'm having a hard time visualizing exactly what A->BC means, mainly what exactly BC does.
For example, on a table "If A -> B and B -> C, then A -> C" would look like this, and the statement would be true:
A | B | C
1 | 2 | 3
1 | 2 | 3
What would A -> BC look like?
How would you show something like "If AB -> C, then A -> BC" is false?
Thanks!
EDIT:
My guess at it is that AB -> C means that C is dependant on both A and B, so the table would look like this:
A | B | C
1 | 2 | 3
1 | 2 | 3
Or this (which would be a counterexample for my question above):
A | B | C
1 | 2 | 4
1 | 3 | 4
And both would be true. But this would be false:
A | B | C
1 | 2 | 4
1 | 3 | 5
Is that the right idea?
In case you haven't already read this, it's an okay introduction to functional dependencies. It says:
Union: If X → Y and X → Z, then X → YZ
Decomposition: If X → YZ, then X → Y and X → Z
I find it helpful to read A -> B as "A determines B", and read A -> BC as "A determines B and C". In other words, given an A, you can uniquely determine the value of B and C, but it's not necessarily true that given a B and a C, you can uniquely determine the value of A.
Here's a simple example: a table with at least 3 columns, where A is the primary key and B and C are any other columns:
id | x | y
------------
1 | 7 | 4
2 | 9 | 4
3 | 7 | 6
To show that If AB -> C, then A -> BC is false, you just have to come up with a single counter-example. Here's one: a table where AB is the primary key (therefore by definition it satisfies AB -> C):
A | B | C
------------
1 | 1 | 4
1 | 2 | 5
2 | 1 | 6
2 | 2 | 4
However, it does not satisfy A -> B (because for A=1, B=1,2) and therefore, by Union, it does not satisfy A -> BC. (Bonus points: does it satisfy A -> C? Does it matter?)

C program to find the number of onto functions and to display all the functions

I have to write a C program (for my Discrete Mathematics assignment) that finds the number of onto functions from set A (|A| = m) to set B (|B|=n) and to display all those functions. Number of onto functions I calculated using the code:
for(k=0; k<n; k++)
x = x + pow( -1, k) * combination( n, n - k ) * pow( ( n - k ), m);
Where combination is a function that finds the Number of possible combinations.
For example if A = {1,2,3}, B={a,b,c} then the number of onto functions evaluated from the formula is
3^3 - 3(2^3) + 3 = 6.
One possible solution is f = {(1,a),(2,b),(3,c)} [I know this is a solution].
But my problem is: How to display each and every solution!?
This is just a trivial example. But if m and n values are increased (provided m>=n) then the number of possible onto functions increases exponentially!!
For example if m=7 and n=4 there are 8400 functions!
I can't think of any method to display each and every function that exists between A and B.
I answered a similar problem sometime ago but m and n were equal m = n.(You must think recursively to solve this), by your comment I think the possible answers are: {(1,a)(2,b)(3,c)}, {(2,a)(3,b)(1,c)}, {(3,a)(1,b)(2,c)}, {(3,a)(2,b)(1,c)}, {(2,a)(1,b)(3,c)} and {(1,a)(3,b)(2,c)} then this is my recipe:
Set 2 arrays with their initial value let's call them letters and numbers.
*---*---*---* *---*---*---*
| a | b | c | <---letters. | 1 | 2 | 3 | <---numbers.
*---*---*---* *---*---*---*
Choose one of the arrays to be your pivot, I chose the letters, it will be statical.
*---*---*---* *---*---*---*
| a | b | c | <---STATIC. | 1 | 2 | 3 | <---DYNAMIC.
*---*---*---* *---*---*---*
Rotate the dynamic array counter-clockwise or clockwise as you wish, you must print the i element of numbers with the i element of letters.
*---*---*---* *---*---*---* *---*---*---*
| 1 | 2 | 3 | -(Print)-> | 2 | 3 | 1 | -(Print)-> | 3 | 1 | 2 |
*---*---*---* *---*---*---* *---*---*---*
So you get at this point: {(1,a)(2,b)(3,c)}, {(2,a)(3,b)(1,c)}, {(3,a)(1,b)(2,c)}, 3 are missing.
Swap the i element with the n element of the dynamic array.
*---*---*---* *---*---*---*
| 1 | 2 | 3 | ---------( Swap (0<->2) )-------> | 3 | 2 | 1 |
*---*---*---* *---*---*---*
Repeat the step 3.
*---*---*---* *---*---*---* *---*---*---*
| 3 | 2 | 1 | -(Print)-> | 2 | 1 | 3 | -(Print)-> | 1 | 3 | 2 |
*---*---*---* *---*---*---* *---*---*---*
So you get the missed subsets:{(3,a)(2,b)(1,c)}, {(2,a)(1,b)(3,c)} and {(1,a)(3,b)(2,c)}.
If you have more than 3 example 4. Easy: 1234 (rotate N times where N is the number of variables and print with each movement), swap 1 and 4 -> 4231 (Rotate and Print), swap 2 and 3 -> 4321 (Rotate and Print), swap 4 and 1 --> 1324 (Rotate and Print).
I hope this helped.

Resources