Is my Relational Algebra correct? - database

I have a database assignment in which I have to write relational algebra for two problems. I am fairly comfortable with most of it, but I get confused when projecting attributes out of a table that is joined to another table.
For example, is this correct?
Q1) List the details of incidences with no calls made, so that the receptionist knows
which incidents still need to be called in.
RESULT <-- PROJECT<STUDENT.FirstName, STUDENT.LastName, STAFF.FirstName,
STAFF.INCIDENT.LastName, INCIDENT.DateTimeReported,
INCIDENT.NatureOfIllness(SELECTINCIDENT.DecisionMade =
''(Staff RIGHT JOIN<STAFF.StaffID = INCIDENT.StaffID>
(INCIDENT LEFT JOIN<INCIDENT.StudentID = STUDENT.StudentID>(STUDENT))))
The SQL that I am trying to translate into relational algebra is:
SELECT
s.FirstName, s.LastName, st.FirstName, st.LastName
, i.DateTimeReported, i.NatureOfIllness
FROM Student s
RIGHT JOIN Incident i ON s.StudentID = i.StudentID
LEFT JOIN Staff st ON st.StaffID = i.StaffID
WHERE i.DecisionMade = ''
Any points of advice would be much appreciated.

It's usually (some exceptions apply, of course) easier to read and understand the SQL if you write it all with LEFT JOINs:
SELECT s.FirstName, s.LastName, st.FirstName, st.LastName, i.DateTimeReported, i.NatureOfIllness
FROM Incident i
LEFT JOIN Student s ON s.StudentID = i.StudentID
LEFT JOIN Staff st ON st.StaffID = i.StaffID
WHERE i.DecisionMade = ''
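To check that the all-LEFT-JOIN form keeps every open incident even when a student or staff row is missing, here is a small sketch using Python's built-in sqlite3 module. The column types, the IncidentID column and all sample rows are made up for illustration; only the table and column names used in the query come from the question.

```python
import sqlite3

# In-memory database with made-up sample data (assumption: simplified schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student  (StudentID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE Staff    (StaffID   INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE Incident (IncidentID INTEGER PRIMARY KEY, StudentID INTEGER, StaffID INTEGER,
                       DateTimeReported TEXT, NatureOfIllness TEXT, DecisionMade TEXT);
INSERT INTO Student VALUES (1, 'Ann', 'Lee');
INSERT INTO Staff   VALUES (10, 'Bob', 'Ray');
-- 100: open, student and staff known
-- 101: open, no staff assigned yet
-- 102: already decided, must be filtered out
INSERT INTO Incident VALUES (100, 1, 10,   '2020-01-01 09:00', 'Fever',  '');
INSERT INTO Incident VALUES (101, 1, NULL, '2020-01-02 10:00', 'Sprain', '');
INSERT INTO Incident VALUES (102, 1, 10,   '2020-01-03 11:00', 'Nausea', 'Called');
""")

rows = conn.execute("""
SELECT s.FirstName, s.LastName, st.FirstName, st.LastName,
       i.DateTimeReported, i.NatureOfIllness
FROM Incident i
LEFT JOIN Student s ON s.StudentID = i.StudentID
LEFT JOIN Staff st  ON st.StaffID  = i.StaffID
WHERE i.DecisionMade = ''
ORDER BY i.IncidentID
""").fetchall()

for row in rows:
    print(row)
```

Incident 101 still appears, with NULL (None in Python) in the staff columns, because Incident is the preserved side of both joins.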

Your version seems correct, except for some typos like STAFF.INCIDENT.LastName. Here's my version:
RESULT <---
PROJECT <STUDENT.FirstName, STUDENT.LastName,
STAFF.FirstName, STAFF.LastName,
INCIDENT.DateTimeReported, INCIDENT.NatureOfIllness>
(SELECT <INCIDENT.DecisionMade = ''>
((STUDENT RIGHT JOIN <STUDENT.StudentID = INCIDENT.StudentID> INCIDENT)
LEFT JOIN <INCIDENT.StaffID = STAFF.StaffID> STAFF))


Interview question help on relatively basic JOIN and subqueries

I was asked to:
Print the following sequence of columns for each plant that only blooms in one type of weather.
WEATHER_TYPE
PLANT_NAME
Schema
PLANTS (table name)
PLANT_NAME, string, The name of the plant. This is the primary key.
PLANT_SPECIES, string, The species of the plant.
SEED_DATE, date, The date the seed was planted.
WEATHER (table name)
PLANT_SPECIES, string, The species of the plant.
WEATHER_TYPE, string, The type of weather in which the plant will bloom.
I wrote the script below, tested it against sample input, and got the desired result. I don't know whether this is what is considered a 'printed' result.
I would like to understand what I might have missed. How might I make this script more efficient, better, or more robust?
SELECT WEATHER.WEATHER_TYPE, a.PLANT_NAME
FROM (SELECT b.PLANT_NAME, b.PLANT_SPECIES
FROM (SELECT PLANTS.PLANT_NAME, PLANTS.PLANT_SPECIES, PLANTS.SEED_DATE, WEATHER.WEATHER_TYPE
FROM PLANTS JOIN WEATHER
ON PLANTS.PLANT_SPECIES = WEATHER.PLANT_SPECIES) b
GROUP BY b.PLANT_NAME, b.PLANT_SPECIES
HAVING count(*) = 1) a JOIN WEATHER
ON a.PLANT_SPECIES = WEATHER.PLANT_SPECIES
I got the expected result in a SQL Server Management Studio window, but I am not sure whether it's the 'printed' result the question-askers are looking for.
I personally consider CTEs easier to read and to debug, compared to nested "Table Expressions", as you have done. I would have done something like:
with
x as (
select p.plant_name, p.plant_species
from plants p
join weather w on w.plant_species = p.plant_species
group by p.plant_name, p.plant_species
having count(*) = 1
)
select x.plant_name, w.weather_type
from x
join weather w on w.plant_species = x.plant_species
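If you want to try the CTE approach against sample data without a SQL Server instance, here is a minimal sketch using Python's sqlite3 module. The plant names and weather rows are invented, and this variant carries plant_species through the CTE so that the final join can use it, with the columns in the order the task asks for.

```python
import sqlite3

# Made-up sample data: one species blooms in one weather type, the other in two.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plants  (plant_name TEXT PRIMARY KEY, plant_species TEXT, seed_date TEXT);
CREATE TABLE weather (plant_species TEXT, weather_type TEXT);
INSERT INTO plants VALUES ('Rosie', 'rose',  '2020-03-01');
INSERT INTO plants VALUES ('Tully', 'tulip', '2020-03-05');
INSERT INTO weather VALUES ('rose',  'Sunny');
INSERT INTO weather VALUES ('tulip', 'Sunny');
INSERT INTO weather VALUES ('tulip', 'Rainy');
""")

rows = conn.execute("""
WITH x AS (
    -- plants whose species matches exactly one weather row
    SELECT p.plant_name, p.plant_species
    FROM plants p
    JOIN weather w ON w.plant_species = p.plant_species
    GROUP BY p.plant_name, p.plant_species
    HAVING COUNT(*) = 1
)
SELECT w.weather_type, x.plant_name
FROM x
JOIN weather w ON w.plant_species = x.plant_species
""").fetchall()

print(rows)
```

Only Rosie is returned, since tulips bloom in two weather types.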
I have to agree with The Impaler that CTEs are easier to read and debug than nested table expressions. As an alternative to the CTE (which is really the better choice), if you want to nest things without overthinking it, you can use a correlated subquery. It's easy to read, though you'll lose efficiency as your result set grows.
SELECT w.weather_type, p.plant_name
FROM plants p
JOIN weather w
ON w.plant_species = p.plant_species
WHERE (SELECT COUNT(1) FROM dbo.weather WHERE plant_species = w.plant_species) = 1
Or, with grouping:
SELECT w.weather_type, p.plant_name
FROM plants p
JOIN weather w
ON w.plant_species = p.plant_species
WHERE w.plant_species IN (SELECT plant_species FROM dbo.weather GROUP BY plant_species HAVING COUNT(1) = 1)
SELECT w.weather_type, p.plant_name
FROM plants p
JOIN weather w
ON w.plant_species = p.plant_species
WHERE w.weather_type = 'Sunny';

Syntax error(missing operator) in query expression in MSAccess

I have written a query in MS Access, but when I try to run it I get an error. I can't find the problem in it.
SELECT
p.[ID] as [ID],
p.[Code] as [CODE],
p.[DESCRIPTION] as [DESCRIPTION],
p.[Coloring] as [Coloring],
p.[Sizing] as [Sizing],
p.[BarCode] as [Barcode],
p.[PartsNo] as [PartsNo],
p.[HSN_SAC] as [HSN_SAC],
p.[GSTRate] as [GSTRate],
p.[Remarks] as Remarks,
c.[CODE] as [CategoryCode],
c.[Description] as [CategoryDescription],
b.[CODE] as [BrandCode],
b.[Description] as [BrandDescription],
s.[Id] as [SupplierId],
s.[Code] as [SupplierCode],
s.[Description] as [SupplierDescription]
FROM [PRODUCTMASTER] p LEFT JOIN [CATEGORYMASTER] c on p.[CategoryId] = c.[ID]
LEFT JOIN [BRANDMASTER ] b on p.[BrandId] = b.[ID]
LEFT JOIN [SUPPLIERAMSTER] s on p.[SupplierId] = s.[ID]
When you join more than two tables in Access, parentheses are required:
FROM (([PRODUCTMASTER] p LEFT JOIN [CATEGORYMASTER] c on p.[CategoryId] = c.[ID])
LEFT JOIN [BRANDMASTER ] b on p.[BrandId] = b.[ID])
LEFT JOIN [SUPPLIERAMSTER] s on p.[SupplierId] = s.[ID]
I would recommend building SQL queries with the query builder; it's much easier than writing them by hand, and you won't get misspelling and bracketing errors like this. Also check the space after the name in [BRANDMASTER ]; this is bad practice in any case. Remove the trailing space from the table name in the table definition, as it may cause other weird errors.
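If you generate the SQL text from code rather than the query builder, the nesting rule is mechanical: every join except the last gets wrapped in parentheses together with everything before it. A minimal sketch in Python follows; nest_joins is a hypothetical helper written for this illustration, not part of Access or any library.

```python
def nest_joins(base, joins):
    """Build an Access-style FROM clause.

    Each join except the last is wrapped in parentheses together with
    everything to its left, which is the shape Access expects.
    """
    clause = base
    for i, join in enumerate(joins):
        clause = f"{clause} {join}"
        if i < len(joins) - 1:
            clause = f"({clause})"
    return "FROM " + clause


# Table names taken from the question (written without the trailing space).
from_clause = nest_joins(
    "[PRODUCTMASTER] p",
    [
        "LEFT JOIN [CATEGORYMASTER] c ON p.[CategoryId] = c.[ID]",
        "LEFT JOIN [BRANDMASTER] b ON p.[BrandId] = b.[ID]",
        "LEFT JOIN [SUPPLIERAMSTER] s ON p.[SupplierId] = s.[ID]",
    ],
)
print(from_clause)
```

With three joins this produces two levels of parentheses, the same shape as the corrected FROM clause above.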

Access Query, which is generated from another query, takes a long time to run

I am adding some new features to a MS Access DB for a client. The original DB has several queries. The new features that I need to add require that I re-use these queries and incorporate them in my VBA and SQL code.
For example, I am generating a new query (via VBA and SQL) based on one of these previous queries. I then export the result as an Excel file.
However, whenever I try to run one of the new queries, it takes about 15 minutes to complete. During this time there is a message in the bottom right of the screen which says "Running query."
Here is one of the SQL queries that I am running. Please note that it ran quickly when there was only one WHERE condition.
SELECT
StudentProgram.fkCohortID AS [Cohort],
Student.pkStudentID AS [Student ID],
Student.EmplID AS [Employee ID],
Student.LastName AS [Last Name],
Student.FirstName AS [First Name],
PostBaccActivity.fkSemesterID AS [Semester],
PostBaccActivity.fkPostBaccID AS [PostBacc],
PostBaccActivity.fkGradSchoolID AS [GradSchool],
PostBaccActivity.ProjectTitle AS [ProjectTitle],
PostBaccActivity.fkFacultyID AS [Faculty],
PostBaccActivity.BeginDate AS [BeginDate],
PostBaccActivity.EndDate AS [EndDate],
PostBaccActivity.Status AS [Status]
FROM qryRptJoinAll
WHERE
qryRptJoinAll.StudentProgram.fkCohortID BETWEEN 1 AND 12
OR qryRptJoinAll.StudentProgram.fkCohortID = 25
OR qryRptJoinAll.StudentProgram.fkCohortID = 28
OR qryRptJoinAll.StudentProgram.fkCohortID = 49
OR qryRptJoinAll.StudentProgram.fkCohortID = 215
OR qryRptJoinAll.StudentProgram.fkCohortID = 220
GROUP BY StudentProgram.fkCohortID, Student.pkStudentID, Student.EmplID, Student.LastName,
Student.FirstName, PostBaccActivity.fkSemesterID, PostBaccActivity.fkPostBaccID,
PostBaccActivity.fkGradSchoolID, PostBaccActivity.ProjectTitle, PostBaccActivity.fkFacultyID,
PostBaccActivity.BeginDate, PostBaccActivity.EndDate, PostBaccActivity.Status
This is the query I am using to generate the other queries:
SELECT
Student.pkStudentID, Student.EmplID, Student.OldID, Student.Inactive,
Student.InactiveReason, Student.Status, Student.LastName, Student.MarriedName,
Student.FirstName, Student.MiddleName, Student.DOB, Student.Sex, Student.SSN,
Student.Email, Student.Race, Student.Ethnicity, Student.EmailSecondary,
Student.fkSemesterBCStart, Student.fkSemesterGrad, Student.TotalCredits,
Student.CreditsAttempted, Student.IndexCredits, Student.QualityPoints,
Student.LocalCredits, Student.TransferCredits, Student.OtherCredits,
StudentProgram.*, StudentEvent.*, StudentResearch.*, StudentEmployment.*,
StudentMajor.*, PostBaccActivity.*, StudentPresentation.*, Presentation.*,
StudentNote.*, SemesterGPA.*, StudentPublication.*, Publication.*, StudentCourse.fkCourseID,
StudentCourse.fkFacultyID, StudentCourse.fkSemesterID, StudentCourse.Grade, Grade.GradeValue,
Grade.NoValue, Course.*, RptCumulativeScienceGPA2.ScienceGPACalc, RptStudentControlList.pkStudentControlID
FROM
((((((((PostBaccActivity RIGHT JOIN ((((((StudentProgram RIGHT JOIN
Student ON StudentProgram.fkStudentID = Student.pkStudentID)
LEFT JOIN Cohort ON StudentProgram.fkCohortID = Cohort.pkCohortID) LEFT JOIN StudentEmployment ON Student.pkStudentID = StudentEmployment.fkStudentID)
LEFT JOIN StudentNote ON Student.pkStudentID = StudentNote.fkStudentID) LEFT JOIN StudentPresentation ON Student.pkStudentID = StudentPresentation.fkStudentID)
LEFT JOIN Presentation ON StudentPresentation.fkPresentationID = Presentation.pkPresentationID) ON PostBaccActivity.fkStudentID = Student.pkStudentID)
LEFT JOIN RptCumulativeScienceGPA2 ON Student.pkStudentID = RptCumulativeScienceGPA2.fkStudentID)
LEFT JOIN RptStudentControlList ON Student.pkStudentID = RptStudentControlList.fkControlID)
LEFT JOIN (StudentPublication LEFT JOIN Publication ON StudentPublication.fkPublicationID = Publication.pkPubID) ON Student.pkStudentID = StudentPublication.fkStudentID)
LEFT JOIN StudentResearch ON Student.pkStudentID = StudentResearch.fkStudentID) LEFT JOIN StudentMajor ON Student.pkStudentID = StudentMajor.fkStudentID)
LEFT JOIN StudentEvent ON Student.pkStudentID = StudentEvent.fkStudentID)
LEFT JOIN (Course RIGHT JOIN (Grade RIGHT JOIN StudentCourse ON Grade.Grade = StudentCourse.Grade) ON Course.pkCourseID = StudentCourse.fkCourseID) ON Student.pkStudentID = StudentCourse.fkStudentID)
LEFT JOIN SemesterGPA ON Student.pkStudentID = SemesterGPA.fkStudentID;
Is there any way I can reduce the time it takes for them to run?
Without knowing the exact queries, I can give only generic advice:
add indices to relevant fields in the tables, these depend on the queries
if the same query is repeated, cache the results and reuse them
joins of large tables might be optimized, by pre-filtering and caching
if possible, implement asynchronous queries which can return partial results while still running in the background; this can give the illusion of a faster response
Since you're not using any aggregate functions, get rid of the GROUP BY clause (if you relied on it to remove duplicate rows, use SELECT DISTINCT instead).
Make sure there is an index on StudentProgram.fkCohortID
But my guess is that the actual complexity is in qryRptJoinAll, so you'd have to show us this query too.
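One thing to keep in mind before dropping the GROUP BY: without aggregate functions it behaves like SELECT DISTINCT, so removing it can change the result if the underlying join produces duplicate rows. Here is a small sketch of that effect using Python's sqlite3 module rather than Access. The joined table and its rows are invented stand-ins for qryRptJoinAll, and the list of single cohort IDs is written with IN purely for brevity.

```python
import sqlite3

# "joined" is a made-up stand-in for a wide join that fans out duplicate rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE joined (cohort INTEGER, student_id INTEGER, last_name TEXT);
INSERT INTO joined VALUES (1, 7, 'Lee');
INSERT INTO joined VALUES (1, 7, 'Lee');   -- duplicate produced by the join
INSERT INTO joined VALUES (25, 8, 'Ray');
INSERT INTO joined VALUES (30, 9, 'Kim');  -- cohort not requested
""")

cols = "cohort, student_id, last_name"
where = "WHERE cohort BETWEEN 1 AND 12 OR cohort IN (25, 28, 49, 215, 220)"

grouped = conn.execute(
    f"SELECT {cols} FROM joined {where} GROUP BY {cols} ORDER BY {cols}"
).fetchall()
distinct = conn.execute(
    f"SELECT DISTINCT {cols} FROM joined {where} ORDER BY {cols}"
).fetchall()
plain = conn.execute(
    f"SELECT {cols} FROM joined {where} ORDER BY {cols}"
).fetchall()

print(len(grouped), len(distinct), len(plain))
```

The grouped and DISTINCT results are identical, while the plain SELECT returns the duplicate row as well. Whether Access runs one form faster than the other is something you would have to measure on the real data.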

oracle grammar to h2 grammar (+) join table

I have the following query in Oracle:
SELECT DISTINCT count(pa.payment_id) FROM
location c, inventory e,
inventory_stock es, payment_client ep,
payment pa, currency cur,
location s, exchange_country exc,
exchange_rate sso,
exchange_hike so,
exchange_margin sov WHERE
cur.outState = 'N' AND
c.location_id = e.location_id AND
e.inventory_id = ep.inventory_id AND
e.inventory_stock_id = es.inventory_stock_id AND
ep.client_id = pa.end_client AND
pa.cur_id = cur.cur_id AND
cur.location_id = s.location_id AND
c.client_id is not null AND
cur.cur_id = exc.cur_id(+) AND
exc.exchange_id = sso.exchange_id(+) AND
sso.account_id = so.account_id(+) AND
so.option_name(+) = 'PREMIUM' AND
exc.exchange_id = sov.exchange_id(+) AND
sov.name(+) = 'VALUE';
I am now using an H2 database, and I get a syntax error at so.option_name(+) and sov.name(+). I know that (+) is Oracle's way of writing right and left outer joins, but is there a way to convert this query to H2 syntax so that it runs without the error and returns equivalent results?
It's time to move on. Oracle's legacy outer join syntax is no longer recommended by Oracle. From the docs:
Oracle recommends that you use the FROM clause OUTER JOIN syntax rather than the Oracle join operator. Outer join queries that use the Oracle join operator (+) are subject to the following rules and restrictions, which do not apply to the FROM clause OUTER JOIN syntax [...]
If you replace (+) usage by outer join, not only will your query work on both Oracle and H2, it will also be an important step forward for your application as a whole.
SELECT DISTINCT count(pa.payment_id)
FROM location c
JOIN inventory e ON c.location_id = e.location_id
JOIN payment_client ep ON e.inventory_id = ep.inventory_id
JOIN inventory_stock es ON e.inventory_stock_id = es.inventory_stock_id
JOIN payment pa ON ep.client_id = pa.end_client
JOIN currency cur ON pa.cur_id = cur.cur_id
JOIN location s ON cur.location_id = s.location_id
LEFT JOIN exchange_country exc ON cur.cur_id = exc.cur_id
LEFT JOIN exchange_rate sso ON exc.exchange_id = sso.exchange_id
LEFT JOIN exchange_hike so
ON sso.account_id = so.account_id
AND so.option_name = 'PREMIUM'
LEFT JOIN exchange_margin sov
ON exc.exchange_id = sov.exchange_id
AND sov.name = 'VALUE'
WHERE c.client_id IS NOT NULL
AND cur.outState = 'N'
The important thing when converting from (+) to LEFT JOIN is to pay close attention to which predicates must go into an ON clause and which predicates are fine in the WHERE clause. In particular, the following two predicates must go into the ON clause of the relevant left-joined table:
so.option_name(+) = 'PREMIUM'
sov.name(+) = 'VALUE'
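To see why the placement matters, here is a small sketch using Python's sqlite3 module (not H2 or Oracle, but LEFT JOIN semantics are the same). The two tables are cut down to the columns involved, and the rows are invented for illustration.

```python
import sqlite3

# Reduced, made-up versions of exchange_rate and exchange_hike.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exchange_rate (exchange_id INTEGER, account_id INTEGER);
CREATE TABLE exchange_hike (account_id INTEGER, option_name TEXT);
INSERT INTO exchange_rate VALUES (1, 100);
INSERT INTO exchange_rate VALUES (2, 200);
INSERT INTO exchange_hike VALUES (100, 'PREMIUM');
INSERT INTO exchange_hike VALUES (200, 'BASIC');
""")

# Predicate in ON: rows of exchange_rate without a PREMIUM match are kept.
in_on = conn.execute("""
SELECT sso.exchange_id, so.option_name
FROM exchange_rate sso
LEFT JOIN exchange_hike so
       ON sso.account_id = so.account_id
      AND so.option_name = 'PREMIUM'
ORDER BY sso.exchange_id
""").fetchall()

# Predicate in WHERE: the NULL-extended rows are filtered out afterwards,
# so the outer join effectively turns into an inner join.
in_where = conn.execute("""
SELECT sso.exchange_id, so.option_name
FROM exchange_rate sso
LEFT JOIN exchange_hike so ON sso.account_id = so.account_id
WHERE so.option_name = 'PREMIUM'
ORDER BY sso.exchange_id
""").fetchall()

print(in_on)
print(in_where)
```

The first query keeps exchange 2 with a NULL option_name; the second drops it, which is not what the original (+) predicates meant.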
Third party tooling
You can use jOOQ's online SQL translator to translate between the syntaxes, or use jOOQ directly to translate from table lists with Oracle joins to ANSI joins.
Disclaimer: I work for the company behind jOOQ

Database Design Relational Algebra query

I have this schema:
Suppliers(sid: integer, sname: string, address: string)
Parts(pid: integer, pname: string, color: string)
Catalog(sid: integer, pid: integer, cost: real)
And this task:
Find the sids of suppliers who supply every part.
What I don't understand is why in this solution we don't work with a negation. I was tempted to put C1.pid <> P.pid instead of C1.pid = P.pid in the end. Can someone explain?
SELECT C.sid
FROM Catalog C
WHERE NOT EXISTS (SELECT P.pid
FROM Parts P
WHERE NOT EXISTS (SELECT C1.sid
FROM Catalog C1
WHERE C1.sid = C.sid
AND C1.pid = P.pid))
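Let's say you have 2 parts and 1 supplier. The supplier has both parts. If you join on <>, your innermost subquery will get two rows back: one for the Catalog entry for Part #1 (because Part #1 <> Part #2 is true); and one for the Catalog entry for Part #2 (likewise). In other words, <> only tells you that the supplier carries some other part; it says nothing about whether the part in question is missing.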
Your reasoning isn't entirely off, but the way to do that is not to use an inequality, but rather to use an outer join and test for the missing record on the outer-joined table:
SELECT c.sid
FROM catalog c
WHERE NOT EXISTS
(SELECT p.pid
FROM parts p LEFT JOIN catalog c1 ON c1.pid = p.pid AND c1.sid = c.sid
WHERE c1.pid IS NULL)
Personally, I find the nested not exists to be a little confusing and needlessly complex. I would be more likely to solve this problem using count:
SELECT c.sid
FROM catalog c
GROUP BY c.sid
HAVING COUNT (DISTINCT c.pid) = (SELECT COUNT (*) FROM parts)
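To compare the approaches on concrete data, here is a sketch using Python's sqlite3 module. The parts and catalog rows are made up: supplier 10 supplies all three parts, supplier 20 only two. The helper suppliers_of_every_part is hypothetical and only exists so the operator in the innermost predicate can be swapped; DISTINCT and ORDER BY are added just to make the output tidy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Parts   (pid INTEGER PRIMARY KEY, pname TEXT, color TEXT);
CREATE TABLE Catalog (sid INTEGER, pid INTEGER, cost REAL);
INSERT INTO Parts VALUES (1, 'bolt', 'red');
INSERT INTO Parts VALUES (2, 'nut', 'green');
INSERT INTO Parts VALUES (3, 'gear', 'blue');
-- supplier 10 supplies every part, supplier 20 is missing part 3
INSERT INTO Catalog VALUES (10, 1, 1.0);
INSERT INTO Catalog VALUES (10, 2, 2.0);
INSERT INTO Catalog VALUES (10, 3, 3.0);
INSERT INTO Catalog VALUES (20, 1, 1.5);
INSERT INTO Catalog VALUES (20, 2, 2.5);
""")


def suppliers_of_every_part(op):
    """Run the double NOT EXISTS query with `op` in the innermost predicate."""
    sql = f"""
    SELECT DISTINCT C.sid
    FROM Catalog C
    WHERE NOT EXISTS (SELECT P.pid
                      FROM Parts P
                      WHERE NOT EXISTS (SELECT C1.sid
                                        FROM Catalog C1
                                        WHERE C1.sid = C.sid
                                          AND C1.pid {op} P.pid))
    ORDER BY C.sid
    """
    return [sid for (sid,) in conn.execute(sql)]


correct = suppliers_of_every_part("=")   # the solution from the question
wrong = suppliers_of_every_part("<>")    # the tempting negated variant

counted = [sid for (sid,) in conn.execute("""
    SELECT c.sid
    FROM Catalog c
    GROUP BY c.sid
    HAVING COUNT(DISTINCT c.pid) = (SELECT COUNT(*) FROM Parts)
    ORDER BY c.sid
""")]

print(correct, wrong, counted)
```

With `=` and with the COUNT version only supplier 10 qualifies. With `<>`, supplier 20 slips through as well, because for every part it can always point at some other part it supplies.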
