I'm in the process of converting some MS Access queries into Transact-SQL format and have run into some problems. Is there a way to write a Join within a Join?
For example:
LEFT JOIN (TaxInfo RIGHT JOIN TaxInfoJackpot
ON TaxInfo.RefNumber = TaxInfoJackpot.RefNumber)
ON HandPay.SlipNumber = TaxInfoJackpot.SlipNumber
This is just a snapshot of a much larger query of course. But, if anyone knows if this is possible any help would be great.
Thanks in advance.
I tend to like all of my joins to be sequential and flowing in the same direction, when possible (and I try to always re-order things so it is possible). LEFT JOIN / RIGHT JOIN / ON / ON is very confusing to follow for anyone, myself included, and I've been doing this for a very long time. Access certainly doesn't do anyone any favors with the bizarre syntax it pumps out (and accepts).
I am not sure if the current syntax provides the results you expect, but can you compare to this format to see if they're the same? Hard to know for sure without sample data and desired results.
SELECT ...
FROM dbo.TaxInfoJackPot AS jp
LEFT OUTER JOIN dbo.HandPay AS hp
ON hp.SlipNumber = jp.SlipNumber
LEFT OUTER JOIN dbo.TaxInfo AS ti
ON jp.RefNumber = ti.RefNumber;
You can do this with a subquery.
LEFT JOIN (
SELECT *
FROM TaxInfo ti
RIGHT JOIN TaxInfoJackpot j ON ti.RefNumber = j.RefNumber
) tij ON HandPay.SlipNumber = tij.SlipNumber
But I'm not sure if you actually need to do it this way. I think you can do this with just normal joins
FROM HandPay h
RIGHT JOIN TaxInfoJackpot j ON h.SlipNumber = j.SlipNumber
LEFT JOIN TaxInfo ti ON j.RefNumber = ti.RefNumber;
Related
I am trying to grasp SQL joins more intuitively. For example, learning how a RIGHT JOIN can just be re-written as a LEFT JOIN (by flipping the order of the tables) helped me understand much better the way that the two joins work.
However, now I'm wondering if an INNER JOIN could be re-written as a LEFT JOIN with a WHERE condition- meaning that their logic could be equivalent (by "logic" I do not mean the execution plan, but the way that the intended result set would be described).
Like:
SELECT * FROM HeaderTable
INNER JOIN DetailTable
ON HeaderTable.ID = DetailTable.ParentID
Which I would read as "Show me all the records from tables HeaderTable and DetailTable that have a matching value in the HeaderTable.ID and DetailTable.ParentID fields." Being the same as:
SELECT * FROM HeaderTable
LEFT JOIN DetailTable
ON HeaderTable.ID = DetailTable.ParentID
WHERE HeaderTable.ID = DetailTable.ParentID
Which I would read as "Show me all the records from tables HeaderTable and DetailTable where the value of HeaderTable.ID is the same as the value of DetailTable.ParentID."
Will these return the same result set? I am more asking about the logic being the same as opposed to one being more efficient than the other.
If I may ask, please don't answer with any Venn diagrams as these don't seem to describe the logic of a join exactly to me.
Yes, they will return the same result. The left join without the where clause would read as show me all the records from the header table and the related items from the details table or null for the details where there are no matches.
Adding a where clause relating the ids effectively transforms the left join to an inner join by eliminating the non-matching rows that would have shown up as having null for the detail part.
In some databases, like MS SQL Server, the left join would show up as an inner join in the query execution plan.
Although you stated that you don't want Venn diagrams I can't help referring you to this question and its answers even though they are filled with (in my opinion very helpful) Venn diagrams.
Yes they would return the same result.
But then you could simply write
SELECT *
FROM HeaderTable, DetailTable
WHERE HeaderTable.ID = DetailTable.ParentID
this returns the same result as well. This is an old syntax used before the join-clauses were introduced.
On a left join if you reference the left in the where then you negate the left and turn it into regular join
I want to join a table to one of two possible tables, depending on data. Here's an attempt that did not work, but gets the idea across, I hope. Also, this is a mocked up example that may not be very realistic, so don't get too hung up on the idea this is representing real students and classes.
SELECT *
FROM
student
INNER JOIN class ON class.student_id = student.student_id
CASE
WHEN class.complete=0
THEN RIGHT OUTER JOIN report ON report.label_id = inprogress.class_id
WHEN class.complete=1
THEN RIGHT OUTER JOIN report ON report.label_id = completed.class_id
END
Any ideas?
You have two join conditions and if either are true you want to commit a join - That's a boolean OR operation.
You simply need to:
RIGHT OUTER JOIN report ON (CONDITION1) OR (CONDITION2)
Let's unravel that a moment though, what is condition 1 and what is condition 2?
WHEN class.complete=0
THEN RIGHT OUTER JOIN report ON report.label_id = inprogress.class_id
WHEN class.complete=1
THEN RIGHT OUTER JOIN report ON report.label_id = completed.class_id
Here you're putting together two conditions on each of your condition 1 and 2, so your condition 1 is:
class.complete = 0 AND report.label_id = inprogress.class_id
and your condition 2 is
class.complete = 1 AND report.label_id = completed.class_id
So the completed SQL should be something like (and this is untested off the top of my head)
RIGHT OUTER JOIN report ON (
class.complete = 0 AND report.label_id = inprogress.class_id
) OR (
class.complete = 1 AND report.label_id = completed.class_id
)
Worth mentioning..
I haven't run the above join but I know from experience the execution plan on that will be absolutely abominable, won't matter if performance isn't important and or your data set is small, but if the performance matters I strongly suggest you post a broader scope of what you want here and we can talk about a better approach to getting your particular data set that won't perform so terribly. I would personally write a join like above only as a last resort or if I was hacking something truly irrelevant.
try this (untested) -
SELECT *
FROM student S
INNER JOIN class c ON c.student_id = S.student_id
left outer join report ir on ir.label_id = inprogress.class_id AND c.complete=0
left outer join report cr on cr.label_id = completed.class_id AND c.complete=1
If you want to join to either of 2 tables with reasonable performance, write a stored procedure for each path (one that joins table 1 to table A, one that joins table 1 to table B). Make a third stored procedure that calls either the 1-A stored procedure or the 1-B stored procedure. This way an efficient query plan will be performed in each case without having to make it recompile on each call or generate a strange query plan.
In your case, you actually want some records from both of the tables you might join to. In that case, you want to union the results together rather than pick one or the other (and you can combine them in one sproc if you want to, it shouldn't hurt the query plan). If you are sure the records won't duplicate between the two queries (it seems like they wouldn't), then as usual use UNION ALL for performance.
I have been trying to test this, but I have doubts about my tests as the timings vary so much.
-- Scenario 1
SELECT * FROM Foo f
INNER JOIN Bar b ON f.id = b.id
WHERE b.flag = true;
-- Scenario 2
SELECT * FROM Foo f
INNER JOIN Bar b ON b.flag = true AND f.id = b.id;
Logically it seems like scenario 2 would be more efficient, but I wasn't sure if SQL server is smart enough to optimize this or not.
Not sure why you think scenario 2 would "logically" be more efficient. On an INNER JOIN everything is basically a filter so SQL Server can collapse the logic to the exact same underlying plan shape. Here's an example from AdventureWorks2012 (click to enlarge):
I prefer separating the join criteria from the filter criteria, so will always write the query in the format on the left. However #HLGEM makes a good point, these clauses are interchangeable in this case only because it's an INNER JOIN. For an OUTER JOIN, it is very important to place the filters on the outer table in the join criteria, else you unwittingly end up with an INNER JOIN and drastically change the semantics of the query. So my advice about how the plan can be collapsed only holds true for inner joins.
If you're worried about performance, I'd start by getting rid of SELECT * and only pulling the columns you actually need (and make sure there's a covering index).
Four months later, another answer has emerged claiming that there usually will be a difference in performance, and that putting filter criteria in the ON clause will be better. While I won't dispute that it is certainly plausible that this could happen, I contend that it certainly isn't the norm and shouldn't be something you use as an excuse to always put all filter criteria in the ON clause.
The accepted answer is correct only for your test case.
An answer to the headline question as stated is yes, moving the constraint to the join condition can greatly improve the query and ensures. I have seen forms similar to this (but perhaps not exactly)...
select *
from A
inner join B
on B.id = a.id
inner join C
on C.id = A.id
where B.z = 1 and C.z = 2;
...not optimize to the same plan as the "on join" equivalents so I tend to use the "on join" constraints as a best practice even for the simpler cases that might have resolved optimally either way.
I have two select statements that I am trying to port from Oracle to Postgres:
1) (Note: this is a subselect part of a bigger select)
SELECT 'Y'
FROM CRAFT MA, CONFIG MAC, ARMS SM
WHERE MCI.MS_INVENTORY_NUMBER = SM.MS_INVENTORY_NUMBER (+)
AND MCI.AB_BASE_ID = MA.AB_BASE_ID_LAUNCH AND SM.ACT_AC_TYPE = MAC.ACT_AC_TYPE
AND SM.MAC_ID = MAC.MAC_ID AND MAC.ACT_AC_TYPE = MA.ACT_AC_TYPE
AND MAC.MAC_ID = MA.MAC_ID_PRI
2)
SELECT ASP.ASP_SPACE_NM,
SUM(MO.MO_TKR_TOTAL_OFF_SCHEDULED) AS "TOTAL_PLANNED"
FROM MISSION_OBJECTIVE MO, SPACE ASP
WHERE ASP.ASP_SPACE_NM = MO.ASP_SPACE_NM (+)
AND MO.MO_MSN_CLASS_NM = 'TOP'
GROUP BY ASP.ASP_SPACE_NM
The (+) syntax is confusing for me... I know it signifies a "join", but I am not familiar enough with SQL to understand what is equivalent to what.
SELECT 'Y'
FROM CRAFT MA
JOIN CONFIG MAC
ON MAC.ACT_AC_TYPE = MA.ACT_AC_TYPE
AND MAC.MAC_ID = MA.MAC_ID_PRI
AND MA.AB_BASE_ID_LAUNCH = MCI.AB_BASE_ID
LEFT JOIN
ARMS SM
ON SM.MS_INVENTORY_NUMBER = MCI.MS_INVENTORY_NUMBER
WHERE SM.ACT_AC_TYPE = MAC.ACT_AC_TYPE
AND SM.MAC_ID = MAC.MAC_ID AND
and
SELECT ASP.ASP_SPACE_NM,
SUM(MO.MO_TKR_TOTAL_OFF_SCHEDULED) AS "TOTAL_PLANNED"
FROM SPACE ASP
LEFT JOIN
MISSION_OBJECTIVE MO
ON MO.ASP_SPACE_NM = ASP.ASP_SPACE_NM
WHERE MO.MO_MSN_CLASS_NM = 'TOP'
GROUP BY
ASP.ASP_SPACE_NM
I left the LEFT JOIN as it was in the original query, but it's redundant here due to the WHERE condition.
You may replace it with an INNER JOIN or just drop the (+) part.
(+) was Oracle's way of expressing an outer join before "LEFT JOIN" (etc.) was added to standard SQL. Some Oracle practitioners did not immediately switch over to the standard syntax, even after it was added to the Oracle dialect of SQL.
The answer you have accepted says that you can replace "LEFT JOIN" with "INNER JOIN". I'm not convinced. But I'm confused by the reference to "MCI" in the first query.
The original programmer probably had a reason for using outer join instead of inner join. You might want to check and see whether that original reason was valid.
When I create a view in SQL Server 2005 and someone later opens it in the GUI modify mode it sometimes completely rearranges my joins. Often to the point it's virtually no longer readable. If I look at it in the GUI it's changed as well from what I originally wrote. Sometimes the join lines no longer even point to a field and the join symbol has "fx". I assume it has "optimized" it for me. Is there a way to prevent this from happening? And what does "fx" mean?
Originally it was something like this:
FROM dbo.Stop
LEFT OUTER JOIN dbo.StopType ON dbo.Stop.StopTypeID = dbo.StopType.StopTypeID
LEFT OUTER JOIN dbo.CityState ON dbo.Stop.City = dbo.CityState.City AND dbo.Stop.State = dbo.CityState.State
LEFT OUTER JOIN dbo.vwDrivers ON dbo.Stop.DriverID = dbo.vwDrivers.DriverID
LEFT OUTER JOIN dbo.truck ON dbo.Truck.TruckID = dbo.Stop.TruckID
INNER JOIN dbo.vwTicketIDFirstStopLastStop ON dbo.vwTicketIDFirstStopLastStop.TicketID = dbo.stop.ticketid
LEFT OUTER JOIN dbo.Company ON dbo.Company.CompanyID = dbo.vwTicketIDFirstStopLastStop.BillToCompanyID
Now it's this.
FROM dbo.Truck RIGHT OUTER JOIN
dbo.Stop INNER JOIN
dbo.StopType ON dbo.Stop.StopTypeID = dbo.StopType.StopTypeID LEFT OUTER JOIN
dbo.CityState ON dbo.Stop.City = dbo.CityState.City AND dbo.Stop.State = dbo.CityState.State LEFT OUTER JOIN
dbo.vwDrivers ON dbo.Stop.DriverID = dbo.vwDrivers.DriverID ON dbo.Truck.TruckID = dbo.Stop.TruckID LEFT OUTER JOIN
dbo.vwTicketIDFirstStopLastStop LEFT OUTER JOIN
dbo.Company ON dbo.Company.CompanyID = dbo.vwTicketIDFirstStopLastStop.BillToCompanyID ON dbo.vwTicketIDFirstStopLastStop.TicketID = dbostop.ticketid
No you cannot. That's why you should never, ever use it.
"Fx" means that the join is not a simple column-to-column link, but involves a function (that's also why it can't point to a field, of course). It should NOT do that by itself, though,
I hate the view GUI
I use RightClick->ScriptViewAs->Alter To->New Query Editor Window
So much nicer :)
It's the result of the view GUI parsing the SQL into it's own internal DOM style format and then writing it back out - much in the same way that HTML editors take in XHTML and emit stuff differently to how you wanted it :(