Count in Relational Algebra - database

I need to query the number of apartments where all rental contracts are signed by occupants of the same nationality.
I tried something like this:
π numberapartments
γ nationality; numberapartments ← Count(a_id)
And I also need some joins somewhere, I don't know.
How could I do this query?
Thanks.
You can find the schema here

Here are some questions to guide you through composing a query like the one in this homework.
When giving tables, say exactly what a row says about the business situation in terms of its column values when it is in the table. Do the same when describing a query result.
What is a query returning rows where
occupant O rents apartment A from date S to date E? Why?
O rents A from a date to a date? Why?
O rents A? Why?
O from nation N rents A? Why?
an occupant from N rents A? Why?
C = the # of nations where an occupant from one rents A? Why?
C = the # of nations where an occupant from one rents A AND C = 1? Why?
a # = the # of nations where an occupant from one rents A & that # = 1? Why?
(the # of nations where an occupant from one rents A) = 1? Why?
What rows are in
Rental?
Occupant?
the result of your desired query? Why?
Re relational querying.
It isn't actually necessary to use counting or grouping to write your query. Such queries that are of the form "rows where … all …" can typically be written using (some variant of) relational division or associated idioms.
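To make the "no counting needed" point concrete, here is a minimal sketch in Python, with sets standing in for relations. The schema is not shown in the question, so the two relations below (rental as (occupant_id, apartment_id) and occupant as (occupant_id, nationality)) and all sample rows are assumptions made up for illustration.

```python
# Hypothetical sample data; the real schema from the question is not available.
# rental:   (occupant_id, apartment_id)
# occupant: (occupant_id, nationality)
rental = {(1, "A1"), (2, "A1"), (3, "A2"), (4, "A2"), (5, "A3")}
occupant = {(1, "FR"), (2, "FR"), (3, "FR"), (4, "DE"), (5, "IT")}

# Natural join on occupant_id, then project onto (apartment_id, nationality):
# "an occupant from N rents A".
apt_nation = {(a, n) for (o, a) in rental for (o2, n) in occupant if o == o2}

# Self-join: apartments rented by occupants from two different nations.
mixed = {a for (a, n) in apt_nation for (a2, n2) in apt_nation
         if a == a2 and n != n2}

# Difference: rented apartments minus the mixed ones.
single_nation = {a for (a, _) in apt_nation} - mixed

# Only the final "number of apartments" needs an aggregate.
number_apartments = len(single_nation)
print(sorted(single_nation), number_apartments)
```

The "all ... same nationality" condition is expressed with a join, a selection and a difference; counting appears only at the very end, because the question asks for a number.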

Related

RA translation to natural language

So I'm stuck on this exercise where I need to translate relational algebra expressions (unary relational operations) based on the Mondial III database into natural language. I need help with the last two, and with any errors in the ones I answered. BTW, I used 6 for sigma (the SELECT operation) and |><| for the THETA JOIN operation (I couldn't find sigma or the real theta join operator on my keyboard, sorry about that). Any help is much appreciated! Thanks in advance.
Here's the meaning of the symbols:
SELECT: Selects all tuples that satisfy the selection condition from a relation R:
6<selection condition>(R)
PROJECT: Produces a new relation with only some of the attributes of R, and removes duplicate tuples:
π<attribute list>(R)
THETA JOIN: Produces all combinations of tuples from R1 and R2 that satisfy the join condition:
R1 |><|<join condition> R2
πname(6elevation>1000(MOUNTAIN)) -> Find the name of all mountains whose elevation is higher than 1000.
6elevation>1000(6population>100000(CITY)) -> Select the city's tuples whose elevation is higher than 1000 with a population greater than 100000
6population>100000(6elevation>1000(CITY)) -> Select the city's tuples whose population is greater than 100000 with an elevation higher than 1000
COUNTRY|><|code=country(LANGUAGE) -> ?
πCountry.name(COUNTRY|><|code=country(6Language.name='English' AND percentage>50(LANGUAGE)) -> ?
The fourth expression returns all the information about the countries together with all the languages spoken in them (the information about a country is repeated for each different language spoken there).
The fifth expression returns the names of all the countries where English is spoken by more than 50% of the population.
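If it helps to check such readings mechanically, the fourth and fifth expressions can be imitated with a few lines of Python. This is only a sketch: the helper functions and the two tiny relations are invented for illustration and are not real Mondial data.

```python
# Hypothetical miniature data, NOT taken from the Mondial database.
country = [
    {"Country.name": "Utopia", "code": "UT"},
    {"Country.name": "Ruritania", "code": "RU"},
]
language = [
    {"country": "UT", "Language.name": "English", "percentage": 80.0},
    {"country": "UT", "Language.name": "French", "percentage": 20.0},
    {"country": "RU", "Language.name": "English", "percentage": 10.0},
]

def select(rel, pred):
    """SELECT: keep the tuples that satisfy the condition."""
    return [t for t in rel if pred(t)]

def project(rel, attrs):
    """PROJECT: keep only the listed attributes and drop duplicates."""
    out = []
    for t in rel:
        row = tuple(t[a] for a in attrs)
        if row not in out:
            out.append(row)
    return out

def theta_join(r1, r2, cond):
    """THETA JOIN: all combinations of tuples that satisfy the condition."""
    return [{**t1, **t2} for t1 in r1 for t2 in r2 if cond(t1, t2)]

def on_code(c, l):
    return c["code"] == l["country"]

# Fourth expression: COUNTRY |><| code=country (LANGUAGE)
fourth = theta_join(country, language, on_code)

# Fifth expression: project Country.name over the join with the filtered LANGUAGE
english_major = select(
    language,
    lambda l: l["Language.name"] == "English" and l["percentage"] > 50,
)
fifth = project(theta_join(country, english_major, on_code), ["Country.name"])

print(len(fourth), fifth)
```

In the sample data, Utopia appears twice in the fourth result (once per language), and only Utopia survives the fifth expression.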

finding max value among two table without using max function in relational algebra

Suppose I have two tables A{int m} and B{int m}, and I have to find the maximum m across the two tables using relational algebra, but I cannot use the max function. How can I do it? I think we can do it using a join, but I am not sure whether my guess is correct.
Note: this is an interview question.
Hmm, I'm puzzled why the question involves two tables. For the question as asked, I would just UNION the two (as StilesCrisis has done), then solve for a single table.
So: how to find the maximum m in a table using only NatJOIN? This is a simplified version of finding the top node on a table that holds a hierarchy (think assembly/component explosions or org charts).
The key idea is that we need to 'copy' the table into something with a different attribute name so that we can compare the tuples pair-wise. (And this will therefore use the degenerate form of NatJOIN aka cross-product). See example here How can I find MAX with relational algebra?
A NOT MATCHING
((A x (A RENAME m AS mm)) WHERE m < mm)
The subtrahend is all tuples with m less than some other tuple's m. The anti-join is all the tuples except those, i.e. the MAX. (Using NOT MATCHING is, I think, both more understandable than MINUS and doesn't need the relations to be UNION-compatible. It's roughly equivalent to SQL NOT EXISTS.)
[I've used Tutorial D syntax, to avoid mucking about with greek letters.]
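For anyone who wants to try the idea out, here is a rough Python equivalent, with sets of integers standing in for the single-attribute relations. The sample values are made up; the union step is the one suggested at the top of this answer.

```python
# Hypothetical contents of A{m} and B{m}.
a = {3, 7, 5}
b = {9, 2}

# UNION the two relations first, then solve for a single relation.
u = a | b

# (U x (U RENAME m AS mm)) WHERE m < mm
smaller = {(m, mm) for m in u for mm in u if m < mm}

# U NOT MATCHING smaller: keep the m values that never appear
# as the smaller member of a pair.
maximum = {m for m in u if all(m != s for (s, _) in smaller)}

print(maximum)
```

The result is a relation (here a one-element set) rather than a bare number, which matches what the algebra expression produces.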
SELECT M FROM (SELECT M FROM A UNION SELECT M FROM B) AS U ORDER BY M DESC LIMIT 1
This doesn't use MAX, just plain vanilla SQL.

Estimating a Size of Joining a Relation with itself

I'm studying size estimation of logical query plans in order to select a physical query plan.
I was wondering what is the size of joining (natural join) a relation to itself?
e.g. R(a,b) JOIN R(a,b), say the total number of tuples is 100 and attributes a and b each have 20 distinct values.
Will the join size (number of tuples in the result) be equal to 100?
I'm so confused!
To answer the question as asked:
Natural join of a relation to itself is the identity operation; you'll get exactly the tuples you started with (yes, 100 tuples in this case).
The equivalent SQL for what you ask is:
SELECT R1.a, R1.b FROM R AS R1, R AS R2 WHERE R1.a = R2.a AND R1.b = R2.b
This is because RA's (Natural) Join always matches by attribute name.
What could be more sensible? What's to be confused about?
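If you want to convince yourself, here is a small Python sketch of a natural join over relations represented as sets of tuples. The natural_join helper and the generated data are my own illustration (100 tuples, 20 distinct values for each of a and b), not anything from a particular DBMS.

```python
def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all commonly named attributes."""
    out = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            common = d1.keys() & d2.keys()
            if all(d1[k] == d2[k] for k in common):
                out.add(frozenset({**d1, **d2}.items()))
    return out

# Hypothetical R(a, b): 100 distinct tuples, 20 distinct values per attribute.
# Each tuple is a frozenset of (attribute, value) pairs so it can live in a set.
r = {frozenset({"a": i % 20, "b": i // 5}.items()) for i in range(100)}

joined = natural_join(r, r)
print(len(r), len(joined), joined == r)
```

Because every attribute is common to both operands, a tuple only ever matches itself, so the result is R again.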

Functional dependencies in DBMS - key

I am reading the book Database Management Systems by Ramakrishnan, and in the chapter on schema refinement and normal forms I saw a sentence saying:
K is a candidate key for R means that K -> R, where R is the relation.
We also have the decomposition rule:
If X -> YZ, then X -> Y and X -> Z
Then, my question is this. For example, let R = XABCDE and let X be the key. Then, since X -> XABCDE, using the decomposition rule repeatedly, we can say X -> A, X -> B, and so on. That means X determines all of the attributes. But I am confused here: then we cannot have two rows in the table with the same X value but different A values. For example, let X be the ID number of a person, and A be the model of the car that person has. Then a person cannot have two cars, but we do not have such a constraint; a person must be able to have two or more cars.
What am i doing wrong here? Can anyone help?
Thanks
"For example, let X be the ID number of a person, and A be the model of the car that person has. Then a person cannot have two cars, but we do not have such a constraint; a person must be able to have two or more cars.
What am I doing wrong here? Can anyone help?"
You went wrong before you started normalizing R.
Part of the job of a database designer is to decide what the database is supposed to store. This has nothing to do with normalization. In textbook problems, this part is done before the problem is presented to you.
If you start with R{XABCDE}, where "X" is a person's ID number, and "A" is a kind of car, sample data for R might look like this.
person_id  car_model       B  C  D  E
--
1          Buick Wildcat   ...
2          Toyota Corolla  ...
3          Honda Accord    ...
Or it might look like this.
person_id  car_model                     B  C  D  E
--
1          Buick Wildcat, Nissan Sentra  ...
2          Toyota Corolla                ...
3          Honda Accord                  ...
Or it might look like this.
person_id  car_model       B  C  D  E
--
1          Buick Wildcat   ...
1          Nissan Sentra   ...
2          Toyota Corolla  ...
3          Honda Accord    ...
The first example suggests that you want to store only one car per person. That's a defensible design decision (unless the database needs to know how many cars each person has). Universities rarely care how many cars you have; they just want to know which one is supposed to have a parking sticker.
Deciding what to store has nothing to do with normalization.
The other examples suggest that you want to store more than one car per person, in which case you need to do some normalization at the very least (in the second example) or reconsider your choice of primary key at the very least (in the third example).
Once you've decided what to store, you can start normalizing. Really, how could you start normalizing before you decide what to store? That would be impossible.
In relation R(XABCDE), if X is a key then for any value of X the relation only permits one value for A,B,C,D and E at any point in time. If that constraint doesn't match the reality you intended to model then maybe X was the wrong choice of key.
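A quick way to see this on sample data is to test whether a set of rows satisfies a functional dependency. The sketch below is only an illustration: satisfies_fd is a hypothetical helper, and the rows reuse the person/car examples from the other answer with the B to E columns left out.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if the rows satisfy the functional dependency lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; any later
        # different value for the same key violates the dependency.
        if seen.setdefault(key, val) != val:
            return False
    return True

one_car_each = [
    {"person_id": 1, "car_model": "Buick Wildcat"},
    {"person_id": 2, "car_model": "Toyota Corolla"},
    {"person_id": 3, "car_model": "Honda Accord"},
]
several_cars = one_car_each + [{"person_id": 1, "car_model": "Nissan Sentra"}]

# person_id -> car_model holds only when each person has a single car.
fd_one = satisfies_fd(one_car_each, ["person_id"], ["car_model"])
fd_many = satisfies_fd(several_cars, ["person_id"], ["car_model"])

# With the composite key {person_id, car_model} the dependency holds again.
fd_composite = satisfies_fd(
    several_cars, ["person_id", "car_model"], ["person_id", "car_model"]
)
print(fd_one, fd_many, fd_composite)
```

A check like this only shows that some sample data is consistent with a dependency; whether the dependency should hold is still a design decision about what the table means.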

Relational Algebra Query Troubles

I have a problem where I have two relations, one containing attributes song_id, song_name, album_id, and the other containing album_id and album_name. I need to find the names of all the albums that do not have songs in the song relation. The problem is I can only use Rename, Projection, Selection, Grouping (with sum, min, max, count), Cartesian Product, and Natural Join. I have spent a good amount of time working on this and would appreciate any help that points me in the right direction.
As @ErwinSmout pointed out, difference is generally an easy way to do it. But since you can't use it, there is a tricky workaround using counts. I'm assuming that every album_id present in the songs relation is also present in the albums relation.
PROJECT album_id from the songs relation (note that relational algebra's PROJECT is equivalent to SQL's SELECT DISTINCT). I'll call this relation song_albums. Now let's take the count of the albums relation, call this m, and take the count of the new relation, call this n.
Take the Cartesian product of the albums relation and the song_albums relation. This new relation has m*n rows. Now if you do a count, grouped by album_name, each of the m album_name values will have a count of n. Not very helpful.
But now we SELECT from the relation the rows where albums.album_id != song_albums.album_id. If you now do a count grouped by album_name, the count for those albums that were not in the original songs relation will still be n, while each album that was in there will have a count of n - 1: song_albums holds each album_id only once, so exactly one row is removed for such an album, however many songs it has.
Edit: As it turns out, this isn't a strictly relational-algebra solution: In SQL, a 1 x 1 table, such as the one containing n can simply be treated as an integer and used in an equality comparison. However, according to Wikipedia, selection must make a comparison between either two attributes of a relation, or an attribute and a constant value.
Another obstacle which will be dealt with by another ill-recommended Cartesian product: we can take the Cartesian product of the 1 x 1 relation containing n with our most recent relation. Now we can make a proper relational-algebra selection since we have an attribute that is always equal to n.
Since this has gotten rather complex, here is a relational-algebra expression capturing the above English explanation:
Note that n is a 1 x 1 relation with an attribute named "count".
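The same steps can also be walked through in Python, with sets standing in for relations. This is a sketch of the counting workaround described above, not the exact algebra expression; the album and song rows are invented, and I group by (album_id, album_name) rather than album_name alone so that two albums with the same name would not be merged.

```python
from collections import Counter

# Hypothetical sample relations.
# albums: (album_id, album_name); songs: (song_id, song_name, album_id)
albums = {(1, "Alpha"), (2, "Beta"), (3, "Gamma")}
songs = {(10, "s1", 1), (11, "s2", 1), (12, "s3", 3)}

# PROJECT album_id from songs (duplicates disappear), then count it.
song_albums = {album_id for (_, _, album_id) in songs}
n = len(song_albums)

# Cartesian product of albums and song_albums.
product = {(aid, name, sid) for (aid, name) in albums for sid in song_albums}

# SELECT rows where albums.album_id != song_albums.album_id.
kept = {row for row in product if row[0] != row[2]}

# Count grouped by album, then keep the groups whose count equals n.
counts = Counter((aid, name) for (aid, name, _) in kept)
without_songs = {name for (aid, name), c in counts.items() if c == n}

print(without_songs)
```

One caveat of the approach itself: if the songs relation is empty, the product is empty as well, so no album is returned even though none of them has songs.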
It's impossible. The problem includes a negation, and in relational algebra, that can only be expressed using relational difference, which you're seemingly not allowed to use.
I'm curious to see what your teacher presents as the solution to this problem.
