Does this query work like a cross join? - sql-server

I am trying to run this query but can't understand how it works.
SELECT *
FROM TABLE1 T1
INNER JOIN TABLE2 T2 ON T1.ID = T2.ID
LEFT JOIN TABLE1 T3 ON T1.ID = T2.ID
table1 contains 3 records with sequential ids 1, 2, 3, and table2 contains 4 records with sequential ids 1, 2, 3, 4.
One other thing I want to know: does this query execute from right to left? That is, is the left join processed first and then the inner join? I ask because that is what the query execution plan suggests.

Here is your current query:
SELECT *
FROM table1 t1
INNER JOIN table2 t2
ON t1.id = t2.id
LEFT JOIN table1 t3
ON t1.id = t2.id
The first join is just a normal inner join between table1 and table2. The second join is a left join, but its ON condition is superfluous: it never references t3, and t1.id = t2.id is already true for every record in the intermediate result, because the inner join guaranteed it. (T-SQL still requires an ON clause on a LEFT JOIN, so the condition cannot simply be dropped.) Hence, the second join is effectively a cross join with table1.
Typically, your join condition will involve the two tables being joined.
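A quick way to check this is a small sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server here (an assumption for illustration), and the tables hold the ids described in the question. It runs the original query and the explicit CROSS JOIN version and compares the rows.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY);
    INSERT INTO table1 (id) VALUES (1), (2), (3);
    INSERT INTO table2 (id) VALUES (1), (2), (3), (4);
""")

# The query from the question: the LEFT JOIN condition never mentions t3.
left_join_rows = conn.execute("""
    SELECT t1.id, t2.id, t3.id
    FROM table1 t1
    INNER JOIN table2 t2 ON t1.id = t2.id
    LEFT JOIN table1 t3 ON t1.id = t2.id
    ORDER BY t1.id, t3.id
""").fetchall()

# The same query written with an explicit CROSS JOIN.
cross_join_rows = conn.execute("""
    SELECT t1.id, t2.id, t3.id
    FROM table1 t1
    INNER JOIN table2 t2 ON t1.id = t2.id
    CROSS JOIN table1 t3
    ORDER BY t1.id, t3.id
""").fetchall()

print(len(left_join_rows))                # 3 inner-join rows x 3 table1 rows
print(left_join_rows == cross_join_rows)
```

With this data the inner join keeps 3 rows and each of them pairs with all 3 rows of t3, so both queries return the same 9 rows.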

Yes it's working like a cross join. If this is the intention you should rewrite it correctly. What you have there is misleading and confusing (where did it come from originally?)
select * from table1 t1
inner join table2 t2
on t1.id=t2.id
cross join table1 t3
The order in which tables, filters, and joins are evaluated is dictated by the query plan (press CTRL-L). This may change at any time. You shouldn't be concerned about the order these run in; you just need to know that you will get the same results no matter how the query is executed. The query planner might choose one method over another if it thinks it will be faster.

Related

SQL get counts using subqueries from multiple linked tables

Suppose I have tables 1-4, where all the other tables are linked to table1. For what it's worth, table1, table2 and table3 are relatively small, but table4 contains a lot of data.
Now I have the following query:
SELECT t1.id
, (SELECT COUNT(*) FROM table2 WHERE table1_id = t1.id) AS t2_count
, (SELECT COUNT(*) FROM table3 WHERE table1_id = t1.id) AS t3_count
, (SELECT COUNT(*) FROM table4 WHERE table1_id = t1.id) AS t4_count
FROM table1 t1
Due to the fact that the subqueries are dependent/correlated I assumed that there must be a better way (performance wise) to get the data.
I tried the following, but it drastically increased the execution time (from about 2s to 35s). I'm guessing that the multiple left joins create a very big data set?!
SELECT t1.id
, COUNT(t2.id) AS t2_count
, COUNT(t3.id) AS t3_count
, COUNT(t4.id) AS t4_count
FROM table1 t1
LEFT JOIN table2 t2 ON t2.table1_id = t1.id
LEFT JOIN table3 t3 ON t3.table1_id = t1.id
LEFT JOIN table4 t4 ON t4.table1_id = t1.id
GROUP BY t1.id
Is there a better way to get the counts? I don't need the data from the other tables.
UPDATE:
Bart's answer got me thinking that the table1_id columns are nullable. I added an IS NOT NULL check to the WHERE clauses and this brought the time down to 1s.
SELECT t1.id
, (SELECT COUNT(*) FROM table2 WHERE table1_id IS NOT NULL AND table1_id = t1.id) AS t2_count
, (SELECT COUNT(*) FROM table3 WHERE table1_id IS NOT NULL AND table1_id = t1.id) AS t3_count
, (SELECT COUNT(*) FROM table4 WHERE table1_id IS NOT NULL AND table1_id = t1.id) AS t4_count
FROM table1 t1
I guess not. If you execute a SELECT COUNT(*) FROM [table], it should perform a count on the table's PK. That should be pretty fast, even for very large tables.
Is your table4 a real table (and not a view, or a table-valued function, or something else that looks like a table)? And does it have a primary key? If so, I don't think that the performance of a SELECT COUNT(*) FROM [table4] query can be increased significantly.
It may also be the case, that your table4 is heavily targeted (in concurrent transactions over multiple connections), or perhaps your SQL Server is doing some heavy IO or computations. I cannot assume anything about that, however. You may try to check if your query is also slow on a restored database backup on a physically separate test server.
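On the asker's guess about the multiple left joins: chaining them does multiply rows per t1.id, which also changes what COUNT returns. The sketch below is a minimal illustration under assumptions, using Python's sqlite3 as a stand-in for SQL Server and made-up row counts (2, 3 and 1 child rows for id 1). It compares the correlated subqueries, the chained left joins, and one possible alternative that aggregates each child table in a derived table before joining. The derived-table form is a suggestion to try, not something proposed in the thread.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, table1_id INTEGER);
    CREATE TABLE table4 (id INTEGER PRIMARY KEY, table1_id INTEGER);
    INSERT INTO table1 (id) VALUES (1), (2);
    INSERT INTO table2 (table1_id) VALUES (1), (1);
    INSERT INTO table3 (table1_id) VALUES (1), (1), (1);
    INSERT INTO table4 (table1_id) VALUES (1);
""")

# Correlated subqueries, as in the question.
subquery_counts = conn.execute("""
    SELECT t1.id
         , (SELECT COUNT(*) FROM table2 WHERE table1_id = t1.id)
         , (SELECT COUNT(*) FROM table3 WHERE table1_id = t1.id)
         , (SELECT COUNT(*) FROM table4 WHERE table1_id = t1.id)
    FROM table1 t1
    ORDER BY t1.id
""").fetchall()

# Chained left joins: child rows multiply (2 x 3 x 1 = 6 rows for id 1).
joined_counts = conn.execute("""
    SELECT t1.id, COUNT(t2.id), COUNT(t3.id), COUNT(t4.id)
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.table1_id = t1.id
    LEFT JOIN table3 t3 ON t3.table1_id = t1.id
    LEFT JOIN table4 t4 ON t4.table1_id = t1.id
    GROUP BY t1.id
    ORDER BY t1.id
""").fetchall()

# Aggregate each child table first, then join one row per table1_id.
preaggregated_counts = conn.execute("""
    SELECT t1.id, COALESCE(c2.n, 0), COALESCE(c3.n, 0), COALESCE(c4.n, 0)
    FROM table1 t1
    LEFT JOIN (SELECT table1_id, COUNT(*) AS n FROM table2 GROUP BY table1_id) c2
           ON c2.table1_id = t1.id
    LEFT JOIN (SELECT table1_id, COUNT(*) AS n FROM table3 GROUP BY table1_id) c3
           ON c3.table1_id = t1.id
    LEFT JOIN (SELECT table1_id, COUNT(*) AS n FROM table4 GROUP BY table1_id) c4
           ON c4.table1_id = t1.id
    ORDER BY t1.id
""").fetchall()

print(subquery_counts)
print(joined_counts)
print(preaggregated_counts)
```

In this sample the chained-join version reports 6 for every count on id 1, because each count sees the multiplied rows, while the other two forms agree with each other. Whether the derived-table form is faster on SQL Server depends on the data and indexes, so it would need checking against the execution plan.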

Custom order for multiple joins in SQL Server

I am trying to write a query in SQL Server that replicates the following figure:
I want the result of the first left join (order_defect & ncdef) to be left joined with the third table (filter), and then the result of those joins left joined with the last one (nsdic).
Each of these tables is huge, so I'm trying to find the most efficient way to do it, because I have limited space and I get an "out of memory" error. Any suggestions for an efficient query?
If I do:
Select *
from A
left join B on a.id = B.id
left join C on a.id = c.id
it joins A and B first and then A and C, but I want the result of "A & B" to be joined with "C".
Basically, my question is how to use the result of one join to join with another table.
Thank you
select
    c.id
    ,c.colum1
    ,c.colum2
    ,c.colum3
    ,c.colum4
    ,t3.colum1
from
(
    -- first join: table1 left joined with table2
    select
        t1.id as id
        ,t1.colum1 as colum1
        ,t1.colum2 as colum2
        ,t2.colum1 as colum3
        ,t2.colum2 as colum4
    from table1 t1
    left join table2 t2
        on t1.id = t2.id
) as c
-- second join: the result of the first join left joined with table3
left join table3 t3
    on c.id = t3.id
It's difficult to help you without the table design: fields/columns, keys, and so on.
But I would consider:
1 - The primary key fields, and how the tables relate to each other
2 - How to add "filters" to the left joins, or otherwise reduce the number of results
3 - Whether it would be better to use subqueries
Plus: try the query with TOP 100 to test it.
And remember: sometimes it's impossible to optimize queries because of hardware limits, such as RAM; in those cases you have to retrieve the data in sections.
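As a side note on the question itself: in a chained query the second LEFT JOIN already operates on the result of the first one, so the derived-table form and the chained form should return the same rows. Here is a small sketch of that, assuming Python's sqlite3 as a stand-in for SQL Server and three hypothetical tables a, b and c with only an id column.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY);
    CREATE TABLE c (id INTEGER PRIMARY KEY);
    INSERT INTO a (id) VALUES (1), (2), (3);
    INSERT INTO b (id) VALUES (1), (2);
    INSERT INTO c (id) VALUES (1), (3);
""")

# Chained form, as written in the question.
chained_rows = conn.execute("""
    SELECT a.id, b.id, c.id
    FROM a
    LEFT JOIN b ON a.id = b.id
    LEFT JOIN c ON a.id = c.id
    ORDER BY a.id
""").fetchall()

# Derived-table form: join A and B first, then join that result with C.
derived_rows = conn.execute("""
    SELECT ab.a_id, ab.b_id, c.id
    FROM (
        SELECT a.id AS a_id, b.id AS b_id
        FROM a
        LEFT JOIN b ON a.id = b.id
    ) AS ab
    LEFT JOIN c ON ab.a_id = c.id
    ORDER BY ab.a_id
""").fetchall()

print(chained_rows)
print(chained_rows == derived_rows)
```

The ON clause of the second join may reference columns from either a or b, which is what makes the chained form a join against the combined result rather than against A alone.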

Is the order of multiple INNER JOINs related to the tables' relationships?

I want to join some tables using INNER JOIN statements. Some of them have a many-to-many relationship, some one-to-many, and some one-to-one. I want to know whether the order of the INNER JOINs matters, and whether it is related to the type of relationship (one-to-one, one-to-many, etc.). Do the three queries below output the same result?
SELECT ....
FROM table1
INNER JOIN (table2 INNER JOIN table3 ON table2.col=table3.col)
ON table1.col=table2.col
SELECT ....
FROM table1
INNER JOIN (table2 INNER JOIN table3 ON table2.col=table3.col)
ON table1.col=table3.col
SELECT ....
FROM table2
INNER JOIN (table1 INNER JOIN table3 ON table1.col=table3.col)
ON table3.col=table2.col
And can I replace the INNER JOIN of two tables with the code below? Does it represent the inner join of table1 and table2?
SELECT ...
FROM table1,table2
WHERE (table1.col=table2.col)
The exact order of the joins does not matter.
It is better to use
select ...
from table1
inner join table2 on table2.col=table1.col
inner join table3 on table3.col=table1.col
Yes, INNER JOINs can be replaced with
WHERE t1.col=t2.col
And the SQL plan will typically be the same.
But if the WHERE clause contains other filters, the join conditions get mixed in with them.
Also, if there are additional join conditions, it is better to filter out all unneeded records first.
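To illustrate, here is a minimal sketch using Python's sqlite3 as a stand-in for SQL Server (an assumption), with three hypothetical single-column tables. It runs the inner joins in two different orders and once with the comma-plus-WHERE syntax, then compares the results.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col INTEGER);
    CREATE TABLE table2 (col INTEGER);
    CREATE TABLE table3 (col INTEGER);
    INSERT INTO table1 (col) VALUES (1), (2), (3);
    INSERT INTO table2 (col) VALUES (2), (3), (4);
    INSERT INTO table3 (col) VALUES (3), (4), (5);
""")

# Explicit INNER JOINs, starting from table1.
order_a = conn.execute("""
    SELECT table1.col, table2.col, table3.col
    FROM table1
    INNER JOIN table2 ON table2.col = table1.col
    INNER JOIN table3 ON table3.col = table1.col
    ORDER BY table1.col
""").fetchall()

# The same joins written in a different order, starting from table3.
order_b = conn.execute("""
    SELECT table1.col, table2.col, table3.col
    FROM table3
    INNER JOIN table1 ON table1.col = table3.col
    INNER JOIN table2 ON table2.col = table1.col
    ORDER BY table1.col
""").fetchall()

# Comma-separated tables with the join conditions in WHERE.
comma_form = conn.execute("""
    SELECT table1.col, table2.col, table3.col
    FROM table1, table2, table3
    WHERE table1.col = table2.col
      AND table3.col = table1.col
    ORDER BY table1.col
""").fetchall()

print(order_a)
print(order_a == order_b == comma_form)
```

Only the value present in all three tables survives, whichever way the query is written. This checks the result sets only; whether the plans match on SQL Server would have to be confirmed there.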
It makes absolutely no difference to the results. Because you are using only inner joins, only the matches in all three tables will show.
If you were to use a LEFT OUTER join and an INNER join in one query, you could vary the resultsets.
For INNER JOIN it will give the same result, but with other types of join it can give a different result.
For example:
SELECT ....
FROM table1
LEFT JOIN (table2 INNER JOIN table3 ON table2.col=table3.col)
ON table1.col=table2.col
The above is equivalent to
SELECT ....
FROM table1
LEFT JOIN (
Select table2.col
From table2 INNER JOIN table3 ON table2.col=table3.col
) tbl
ON table1.col=tbl.col
It first performs the INNER JOIN of table2 and table3; table1 is then left joined with the result.
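The difference shows up as soon as the joins are written without the nesting. The sketch below assumes Python's sqlite3 as a stand-in for SQL Server and uses hypothetical sample values. It compares the nested form (in its derived-table spelling) with a flat LEFT JOIN followed by an INNER JOIN.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col INTEGER);
    CREATE TABLE table2 (col INTEGER);
    CREATE TABLE table3 (col INTEGER);
    INSERT INTO table1 (col) VALUES (1), (2), (3);
    INSERT INTO table2 (col) VALUES (1), (2);
    INSERT INTO table3 (col) VALUES (1);
""")

# Nested form: inner join table2 and table3 first, then left join from table1.
nested_rows = conn.execute("""
    SELECT table1.col, tbl.col
    FROM table1
    LEFT JOIN (
        SELECT table2.col AS col
        FROM table2
        INNER JOIN table3 ON table2.col = table3.col
    ) tbl ON table1.col = tbl.col
    ORDER BY table1.col
""").fetchall()

# Flat form: the trailing INNER JOIN filters the left join's result.
flat_rows = conn.execute("""
    SELECT table1.col, table2.col
    FROM table1
    LEFT JOIN table2 ON table1.col = table2.col
    INNER JOIN table3 ON table2.col = table3.col
    ORDER BY table1.col
""").fetchall()

print(nested_rows)
print(flat_rows)
```

The nested form keeps every table1 row, with NULL where nothing matched. The flat form drops the rows whose table2.col is NULL or missing from table3, so only the fully matched row remains.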

SQL Server speed: left outer join vs inner join

In theory, why would an inner join work remarkably faster than a left outer join, given that both queries return the same result set? I had a query that would take a long time to describe, but this is what I saw when changing a single join: left outer join - 6 sec, inner join - 0 sec (the rest of the query is the same). Result set: the same.
Actually, depending on the data, a left outer join and an inner join would not return the same results. Most likely the left outer join will return more rows, but again, that depends on the data.
I'd be worried if I changed a left join to an inner join and the results were not different. I would suspect that you have a condition on the left-joined table (t2 in the example below) in the WHERE clause, effectively (and probably incorrectly) turning it into an inner join.
Something like:
select *
from table1 t1
left join table2 t2 on t1.myid = t2.myid
where t2.somefield = 'something'
Which is not the same thing as
select *
from table1 t1
left join table2 t2
on t1.myid = t2.myid and t2.somefield = 'something'
So first I would be worried that my query was incorrect to begin with, then I would worry about performance. An inner join is NOT a performance enhancement for a left join. They mean two different things and should return different results, unless you have a table where there will always be a match for every record. In that case you change to an inner join because the left join is incorrect, not to improve performance.
My best guess as to the reason the left join takes longer is that it is joining to many more rows that then get filtered out by the where clause. But that is just a wild guess. To know you need to look at the Execution plans.
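The difference between the two queries above can be seen on a tiny data set. This is a sketch under assumptions: Python's sqlite3 stands in for SQL Server, and the rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (myid INTEGER);
    CREATE TABLE table2 (myid INTEGER, somefield TEXT);
    INSERT INTO table1 (myid) VALUES (1), (2), (3);
    INSERT INTO table2 (myid, somefield) VALUES (1, 'something'), (2, 'other');
""")

# Filter in WHERE: rows without a matching t2 have NULL somefield and are removed,
# so the left join behaves like an inner join.
where_rows = conn.execute("""
    SELECT t1.myid, t2.somefield
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.myid = t2.myid
    WHERE t2.somefield = 'something'
    ORDER BY t1.myid
""").fetchall()

# Filter in ON: every t1 row is kept; only the matching t2 data is restricted.
on_rows = conn.execute("""
    SELECT t1.myid, t2.somefield
    FROM table1 t1
    LEFT JOIN table2 t2
        ON t1.myid = t2.myid AND t2.somefield = 'something'
    ORDER BY t1.myid
""").fetchall()

print(where_rows)
print(on_rows)
```

The WHERE version returns a single row, the same as an inner join would, while the ON version returns all three table1 rows with NULL for the unmatched ones.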

CROSS APPLY Performance

Is it possible to improve performance by taking the following SQL:
SELECT t1.id,
       t1.name,
       t2.subname,
       t2.refvalue
FROM   table1 AS t1
       CROSS APPLY (SELECT TOP 1 t2.subid,
                                 t2.subname,
                                 t3.refvalue
                    FROM   table2 AS t2
                           INNER JOIN table3 AS t3
                                   ON t2.subid = t3.subid
                    ORDER  BY lastupdated DESC) AS t2
And rewriting it so that it looks like this:
SELECT t1.id,
       t1.name,
       t2.subname,
       t3.refvalue
FROM   table1 AS t1
       CROSS APPLY (SELECT TOP 1 t2.subid,
                                 t2.subname
                    FROM   table2 AS t2
                    ORDER  BY lastupdated DESC) AS t2
       INNER JOIN table3 AS t3
               ON t2.subid = t3.subid
Firstly, does it give the same result?
If so, what does the query plan say, and what does SET STATISTICS IO ON report?
How many rows in Table1, Table2 and Table3? How many intersect and end up in the result? I'm trying to figure out the purpose of rewriting the query, and agree with gbn... do you get the same result, does the query plan look the same in both cases, do the statistics i/o get any better, and does the rewritten query run any faster?

Resources