CROSS APPLY Performance - sql-server

Is it possible to improve performance by taking the following SQL:
SELECT t1.id,
t1.name,
t2.subname,
t2.refvalue
FROM table1 AS t1
CROSS apply (SELECT TOP 1 t2.subid,
t2.subname,
t3.refvalue
FROM table2 AS t2
INNER JOIN table3 AS t3
ON t2.subid = t3.subid
ORDER BY lastupdated DESC) AS t2
And rewriting it so that it looks like this:
SELECT t1.id,
t1.name,
t2.subname,
t3.refvalue
FROM table1 AS t1
CROSS apply (SELECT TOP 1 t2.subid,
t2.subname
FROM table2 AS t2
ORDER BY lastupdated DESC) AS t2
INNER JOIN table3 AS t3
ON t2.subid = t3.subid

Firstly, does it give the same result?
If so, what does the query plan say, and what does SET STATISTICS IO ON report for each version?

How many rows in Table1, Table2 and Table3? How many intersect and end up in the result? I'm trying to figure out the purpose of rewriting the query, and agree with gbn... do you get the same result, does the query plan look the same in both cases, do the statistics i/o get any better, and does the rewritten query run any faster?
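To make the "same result?" question concrete, here is a minimal sketch with made-up data. It assumes SQLite as a stand-in for SQL Server: LIMIT 1 replaces TOP 1, and since the applied subquery never references t1, a CROSS JOIN to a derived table (aliased latest here, a hypothetical name) replaces CROSS APPLY. It says nothing about SQL Server's plans or timings, only about whether the two shapes can return different rows.

```python
import sqlite3

# Hypothetical sample data; SQLite is only a stand-in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (subid INTEGER, subname TEXT, lastupdated TEXT);
    CREATE TABLE table3 (subid INTEGER, refvalue TEXT);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO table2 VALUES (10, 'old', '2020-01-01'), (20, 'new', '2021-01-01');
    -- Only the older table2 row has a match in table3.
    INSERT INTO table3 VALUES (10, 'r10');
""")

# Shape 1: the join to table3 happens inside the "top 1" subquery.
join_inside = conn.execute("""
    SELECT t1.id, t1.name, latest.subname, latest.refvalue
    FROM table1 AS t1
    CROSS JOIN (SELECT t2.subid, t2.subname, t3.refvalue
                FROM table2 AS t2
                INNER JOIN table3 AS t3 ON t2.subid = t3.subid
                ORDER BY t2.lastupdated DESC
                LIMIT 1) AS latest
    ORDER BY t1.id
""").fetchall()

# Shape 2: "top 1" is taken from table2 alone, then joined to table3.
join_outside = conn.execute("""
    SELECT t1.id, t1.name, latest.subname, t3.refvalue
    FROM table1 AS t1
    CROSS JOIN (SELECT t2.subid, t2.subname
                FROM table2 AS t2
                ORDER BY t2.lastupdated DESC
                LIMIT 1) AS latest
    INNER JOIN table3 AS t3 ON latest.subid = t3.subid
    ORDER BY t1.id
""").fetchall()

print(join_inside)
print(join_outside)
```

With this data the first shape picks the newest table2 row that has a table3 match, while the second picks the newest table2 row outright and then loses it in the join, so the two are only interchangeable if every table2 row has exactly one table3 match.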

Related

SQL get counts using subqueries from multiple linked tables

Suppose I have tables 1-4, where all the other tables are linked to table1. For what it's worth, table1, table2 and table3 are relatively small, but table4 contains a lot of data.
Now I have the following query:
SELECT t1.id
, (SELECT COUNT(*) FROM table2 WHERE table1_id = t1.id) AS t2_count
, (SELECT COUNT(*) FROM table3 WHERE table1_id = t1.id) AS t3_count
, (SELECT COUNT(*) FROM table4 WHERE table1_id = t1.id) AS t4_count
FROM table1 t1
Due to the fact that the subqueries are dependent/correlated I assumed that there must be a better way (performance wise) to get the data.
I tried the following, but it drastically increased the execution time (from about 2s to 35s). I'm guessing that the multiple left joins create a very big data set?!
SELECT t1.id
, COUNT(t2.id) AS t2_count
, COUNT(t3.id) AS t3_count
, COUNT(t4.id) AS t4_count
FROM table1 t1
LEFT JOIN table2 t2 ON t2.table1_id = t1.id
LEFT JOIN table3 t3 ON t3.table1_id = t1.id
LEFT JOIN table4 t4 ON t4.table1_id = t1.id
GROUP BY t1.id
Is there better way to get the counts? I don't need the data from the other tables.
UPDATE:
Bart's answer got me thinking that the table1_id columns are nullable. I added an IS NOT NULL check to the WHERE clauses and this brought the time down to 1s.
SELECT t1.id
, (SELECT COUNT(*) FROM table2 WHERE table1_id IS NOT NULL AND table1_id = t1.id) AS t2_count
, (SELECT COUNT(*) FROM table3 WHERE table1_id IS NOT NULL AND table1_id = t1.id) AS t3_count
, (SELECT COUNT(*) FROM table4 WHERE table1_id IS NOT NULL AND table1_id = t1.id) AS t4_count
FROM table1 t1
I guess not. If you execute a SELECT COUNT(*) FROM [table], it should perform a count on the table's PK. That should be pretty fast, even for very large tables.
Is your table4 a real table (and not a view, or a table-valued function, or something else that looks like a table)? And does it have a primary key? If so, I don't think that the performance of a SELECT COUNT(*) FROM [table4] query can be improved significantly.
It may also be the case, that your table4 is heavily targeted (in concurrent transactions over multiple connections), or perhaps your SQL Server is doing some heavy IO or computations. I cannot assume anything about that, however. You may try to check if your query is also slow on a restored database backup on a physically separate test server.
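On the guess in the question that the multiple left joins create a very big data set: a small sketch with invented rows, run on SQLite rather than SQL Server, suggests that is what happens, and that the joined version also changes the counts. The third query (joining to pre-aggregated derived tables, with hypothetical aliases c2, c3, c4) is one possible alternative and not something proposed in the thread; whether it beats the correlated subqueries on real data would have to be measured.

```python
import sqlite3

# Hypothetical data: id 1 has 2, 3 and 4 child rows; id 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, table1_id INTEGER);
    CREATE TABLE table4 (id INTEGER PRIMARY KEY, table1_id INTEGER);
    INSERT INTO table1 (id) VALUES (1), (2);
    INSERT INTO table2 (table1_id) VALUES (1), (1);
    INSERT INTO table3 (table1_id) VALUES (1), (1), (1);
    INSERT INTO table4 (table1_id) VALUES (1), (1), (1), (1);
""")

# The correlated subqueries from the question.
correlated = conn.execute("""
    SELECT t1.id,
           (SELECT COUNT(*) FROM table2 WHERE table1_id = t1.id),
           (SELECT COUNT(*) FROM table3 WHERE table1_id = t1.id),
           (SELECT COUNT(*) FROM table4 WHERE table1_id = t1.id)
    FROM table1 t1
    ORDER BY t1.id
""").fetchall()

# The LEFT JOIN version: child rows multiply before they are counted.
joined = conn.execute("""
    SELECT t1.id, COUNT(t2.id), COUNT(t3.id), COUNT(t4.id)
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.table1_id = t1.id
    LEFT JOIN table3 t3 ON t3.table1_id = t1.id
    LEFT JOIN table4 t4 ON t4.table1_id = t1.id
    GROUP BY t1.id
    ORDER BY t1.id
""").fetchall()

# Assumed alternative: aggregate each child table once, then join.
pre_aggregated = conn.execute("""
    SELECT t1.id, COALESCE(c2.n, 0), COALESCE(c3.n, 0), COALESCE(c4.n, 0)
    FROM table1 t1
    LEFT JOIN (SELECT table1_id, COUNT(*) AS n FROM table2 GROUP BY table1_id) c2
           ON c2.table1_id = t1.id
    LEFT JOIN (SELECT table1_id, COUNT(*) AS n FROM table3 GROUP BY table1_id) c3
           ON c3.table1_id = t1.id
    LEFT JOIN (SELECT table1_id, COUNT(*) AS n FROM table4 GROUP BY table1_id) c4
           ON c4.table1_id = t1.id
    ORDER BY t1.id
""").fetchall()

print(correlated)
print(joined)
print(pre_aggregated)
```

For id 1 the joined query counts 2 x 3 x 4 = 24 rows in every column, so besides being slower it no longer answers the original question unless COUNT(DISTINCT ...) is used.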

Does this query work like a cross join?

I am trying to run this query but can't understand how it works.
SELECT *
FROM TABLE1 T1
INNER JOIN TABLE2 T2 ON T1.ID = T2.ID
LEFT JOIN TABLE1 T3 ON T1.ID = T2.ID
table1 contains 3 records with sequential ids 1, 2, 3, and table2 contains 4 records with sequential ids 1, 2, 3, 4.
One other thing I want to know: does this query execute from right to left? I mean, does the left join execute first and then the inner join? I am asking this based on the query execution plan.
Here is your current query:
SELECT *
FROM table1 t1
INNER JOIN table2 t2
ON t1.id = t2.id
LEFT JOIN table1 t3
ON t1.id = t2.id
The first join is just a normal inner join between table1 and table2. The second join is a left join, but the ON condition is superfluous, and I believe the behavior would be the same without this condition even being there. The reason for this is that the join condition t1.id = t2.id will already always be true at that point in the query, for every record in the intermediate table. Hence, it appears that the second join would effectively be a cross join with table1.
Typically, your join condition will involve the two tables being joined.
Yes it's working like a cross join. If this is the intention you should rewrite it correctly. What you have there is misleading and confusing (where did it come from originally?)
select * from table1 t1
inner join table2 t2
on t1.id=t2.id
cross join table1 t3
The order in which tables, filters and joins are evaluated is dictated by the query plan (press CTRL-L). This may change at any time. You shouldn't be concerned about the order these run in; you just need to know that you will get the same results no matter how the query is executed. The query planner might choose one method over the other if it thinks it will be faster.
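A quick way to check the cross-join claim is to run both forms on the data described in the question (ids 1-3 in table1, ids 1-4 in table2). This is a sketch on SQLite, not SQL Server, so it only speaks to the result set, not to the execution plan.

```python
import sqlite3

# Data as described in the question: table1 has ids 1..3, table2 has ids 1..4.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1), (2), (3), (4);
""")

# Original query: the LEFT JOIN condition never mentions t3.
left_join_rows = sorted(conn.execute("""
    SELECT t1.id, t2.id, t3.id
    FROM table1 t1
    INNER JOIN table2 t2 ON t1.id = t2.id
    LEFT JOIN table1 t3 ON t1.id = t2.id
""").fetchall())

# Explicit rewrite from the answer.
cross_join_rows = sorted(conn.execute("""
    SELECT t1.id, t2.id, t3.id
    FROM table1 t1
    INNER JOIN table2 t2 ON t1.id = t2.id
    CROSS JOIN table1 t3
""").fetchall())

print(len(left_join_rows), left_join_rows == cross_join_rows)
```

The inner join keeps the 3 matching ids, and each of those is paired with all 3 rows of table1, giving 9 rows in both forms.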

Convert T-SQL Cross Apply to Oracle

I'm looking to convert this SQL Server (T-SQL) query that uses a cross apply to Oracle 11g. Oracle does not support CROSS APPLY until 12c, so I have to find a work-around. The idea behind the query is that for each row where Tab.Name = 'Foobar', I need to find the previous row's name with the same ID, ordered by Tab.Date. (This table contains multiple rows for one ID with different Name and Date.)
Here is the T-SQL code:
SELECT DISTINCT t1.ID,
t1.Name,
t1.Date,
t2.Date as 'PreviousDate',
t2.Name as 'PreviousName'
FROM Tab t1
OUTER apply (SELECT TOP 1 t2.Date,
t2.Name
FROM Tab t2
WHERE t1.Id = t2.Id
ORDER BY t2.Date DESC) t2
WHERE t1.Name = 'Foobar'
Technically, I was able to recreate this same functionality in Oracle using LEFT JOIN and LAG() function:
SELECT DISTINCT t1.ID,
t1.Name,
t1.Date,
t2.PreviousDate as PreviousDate,
t2.PreviousName as PreviousName
FROM Tab t1
LEFT JOIN (
SELECT ID,
LAG(Name) OVER (PARTITION BY ID ORDER BY Date) as PreviousName,
LAG(Date) OVER (PARTITION BY ID ORDER BY Date) as PreviousDate
FROM Tab) t2 ON t2.ID = t1.ID
WHERE t1.Name = 'Foobar'
The issue is the order it executes the Oracle query. It will pull back ALL rows from Tab, order them (because of the LAG function), then it will filter them down using the ON statement when it joins it to the main query. That table has millions of records, so doing that for EACH ID is not feasible. Basically, I want to change the order of operations in the sub-query to just pull back rows for a single ID, sort those rows to find the previous, and join that. Any ideas on how to tweak it?
TL;DR
SQL Server: filters, orders, joins
Oracle: orders, filters, joins
You can look for the latest row per (id) group with row_number():
select *
from tab t1
left join
(
select row_number() over (
partition by id
order by Date desc) as rn
, t.*
from tab t
) t2
on t1.id = t2.id
and t2.rn = 1 -- Latest row per id
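For comparison, here is a minimal sketch of the LAG() idea from the question, with the filter applied outside the windowed subquery so that each 'Foobar' row carries its own previous row. It runs on SQLite (3.25 or later, for window functions) with invented rows, uses a hypothetical column name dt in place of Date, and does not settle how Oracle would order the operations on millions of rows.

```python
import sqlite3

# Hypothetical rows; dt stands in for the Date column from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab (id INTEGER, name TEXT, dt TEXT);
    INSERT INTO tab VALUES
        (1, 'Alpha',  '2020-01-01'),
        (1, 'Foobar', '2020-02-01'),
        (1, 'Beta',   '2020-03-01'),
        (2, 'Foobar', '2020-01-15');
""")

# LAG() is computed per id over all rows first; the name filter is applied
# afterwards, so the previous row may have any name.
rows = conn.execute("""
    SELECT id, name, dt, previous_name, previous_dt
    FROM (SELECT id, name, dt,
                 LAG(name) OVER (PARTITION BY id ORDER BY dt) AS previous_name,
                 LAG(dt)   OVER (PARTITION BY id ORDER BY dt) AS previous_dt
          FROM tab) AS with_previous
    WHERE name = 'Foobar'
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

Filtering on name inside the subquery instead would change the answer, because LAG() would then only see other 'Foobar' rows.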

The effect of a select with multiple tables in FROM is the same as INNER JOIN but what is the ON clause then?

I have a query like this:
SELECT
*
FROM
table1,
table2
I know this is somewhat equivalent to:
SELECT
*
FROM
table1
INNER JOIN
table2
ON ???
However, what would be the resulting ON clause for the join?
Update
After some testing in SSMS here are my findings
SELECT * FROM table1,table2
gives the same execution plan and the same records as
SELECT * FROM table1 INNER JOIN table2 ON 1=1
and the same thing for
SELECT * FROM table1 CROSS JOIN table2
The ON clause would name the column that defines their relationship.
SELECT *
FROM table1
INNER JOIN table2
ON table1.ID = table2.ID
Actually, the two queries you have shown are not equivalent. The first one produces the Cartesian product of all the records in both tables, in other words a CROSS JOIN.
SELECT
*
FROM
table1,
table2
is equivalent to:
SELECT
*
FROM
table1
CROSS JOIN
table2
There is no ON clause with a CROSS JOIN. If you need to filter a CROSS JOIN, put the condition in the WHERE clause.
WHERE table1.DateCreated <= table2.DateModified
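The equivalence is easy to check on a toy data set. The sketch below uses SQLite rather than SSMS, with invented rows and a small hypothetical helper run(), so it compares result sets only and says nothing about SQL Server execution plans.

```python
import sqlite3

# Hypothetical rows, just enough to make the Cartesian product visible.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID INTEGER, DateCreated TEXT);
    CREATE TABLE table2 (ID INTEGER, DateModified TEXT);
    INSERT INTO table1 VALUES (1, '2020-01-01'), (2, '2020-06-01');
    INSERT INTO table2 VALUES (1, '2020-03-01'), (2, '2020-09-01'), (3, '2019-01-01');
""")

def run(sql):
    """Return the rows of a query in sorted order so results can be compared."""
    return sorted(conn.execute(sql).fetchall())

comma = run("SELECT * FROM table1, table2")
on_true = run("SELECT * FROM table1 INNER JOIN table2 ON 1=1")
cross = run("SELECT * FROM table1 CROSS JOIN table2")

# Filtering the Cartesian product in the WHERE clause.
filtered = run("""
    SELECT * FROM table1 CROSS JOIN table2
    WHERE table1.DateCreated <= table2.DateModified
""")

print(len(comma), comma == on_true == cross, len(filtered))
```

All three spellings return the same 2 x 3 = 6 rows, and the WHERE condition then keeps the 3 pairs whose creation date is not later than the modification date.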

I want to replace the "left outer join" with an alternate basic query, how can I do it?

The query looks like this:
Select t1.*, t2.balance from t1 left outer join t2 on (t1.id1 = t2.id1 and t1.id2 = t2.id2)
where t1.name = 'name';
This worked fine while I was using native queries, but now I need to use Hibernate's JPA implementation for all the queries. The tables involved are not associated in any way.
That's why I want to use the alternate fundamental query equivalent to left outer join.
Thanks,
Mahesh
My only suggestion would be to UNION the results of two queries, the inner join and then the rows from t1 without a match in t2, something like:
Select t1.*, t2.balance from t1, t2 where t1.name = 'name' and t1.id1 = t2.id1 and t1.id2 = t2.id2
UNION
Select t1.*, null from t1 where t1.name = 'name' and (t1.id1,t1.id2) not in (select id1, id2 from t2)
;
I'm not familiar with Hibernate and so don't know if this gives you the same issue. I suppose in the worst case if you really can only execute basic queries then having two independent queries and combining the results in code may have to suffice.
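As a sanity check on the UNION idea, here is a small sketch comparing it with the original left outer join. It assumes SQLite (3.15 or later, for the row-value NOT IN) and invented rows, and it says nothing about what Hibernate will accept. The sample data has no NULL keys and no duplicate rows, so the NULL behaviour of NOT IN and the de-duplication done by UNION are not exercised.

```python
import sqlite3

# Hypothetical rows: (1,1) has a balance, (2,2) has none, (3,3) has another name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id1 INTEGER, id2 INTEGER, name TEXT);
    CREATE TABLE t2 (id1 INTEGER, id2 INTEGER, balance REAL);
    INSERT INTO t1 VALUES (1, 1, 'name'), (2, 2, 'name'), (3, 3, 'other');
    INSERT INTO t2 VALUES (1, 1, 100.0);
""")

# The original left outer join.
left_join = conn.execute("""
    SELECT t1.*, t2.balance
    FROM t1 LEFT OUTER JOIN t2 ON (t1.id1 = t2.id1 AND t1.id2 = t2.id2)
    WHERE t1.name = 'name'
""").fetchall()

# The emulation: matched rows, plus unmatched t1 rows padded with NULL.
union_rows = conn.execute("""
    SELECT t1.*, t2.balance FROM t1, t2
    WHERE t1.name = 'name' AND t1.id1 = t2.id1 AND t1.id2 = t2.id2
    UNION
    SELECT t1.*, NULL FROM t1
    WHERE t1.name = 'name'
      AND (t1.id1, t1.id2) NOT IN (SELECT id1, id2 FROM t2)
""").fetchall()

# Row order is not guaranteed, so compare as sets.
print(set(left_join) == set(union_rows))
```

If t2 can contain NULL in id1 or id2, NOT IN may stop returning the unmatched rows, in which case a NOT EXISTS subquery is the usual substitute.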
