Not in Not exists - sql-server

I have three tables
table1 -> xt1, yt1, zt1;
table2 -> xt2
table3 -> yt3, zt3
SELECT xt1, yt1, zt1
From table1, table3
Where xt1
NOT IN
(SELECT DISTINCT table1.xt1 FROM table2 INNER JOIN table1 ON
table1.xt1 = Replace(table2.xt2,',',''))
And table1.yt1 = table3.yt3
AND table1.zt1 = table3.zt3
it is working correctly but i take long time.
if i replace NOT IN with Not exists it return empty set.
SELECT xt1, yt1, zt1
From table1, table3
Where Not exists
(SELECT DISTINCT table1.xt1 FROM table2 INNER JOIN table1 ON
table1.xt1 = Replace(table2.xt2,',',''))
And table1.yt1 = table3.yt3
AND table1.zt1 = table3.zt3
the results of the second select should be 6 rows but it returns notiong with not exists.
also if i tried to change the compare part to
table1.xt1 != Replace(table2.xt2,',','') and remove the NOT IN
select it get outof memory error.
So is this the best way to write my query and why it return empty set with Not exists
thank you.

Ok, first of all, I changed your implicit join to an explicit one. Then I fixed the NOT EXISTS so it correlates to the outer table1:
SELECT t1.xt1, t1.yt1, t1.zt1
FROM table1 AS t1
INNER JOIN table3 AS t3
ON t1.yt1 = t3.yt3
AND t1.zt1 = t3.zt3
WHERE NOT EXISTS ( SELECT 1
FROM table2 AS t2
INNER JOIN table1 AS t1_1
ON t1_1.xt1 = REPLACE(t2.xt2,',','')
AND t1_1.xt1 = t1.xt1) ;
which can be simplified further to:
SELECT t1.xt1, t1.yt1, t1.zt1
FROM table1 AS t1
INNER JOIN table3 AS t3
ON t1.yt1 = t3.yt3
AND t1.zt1 = t3.zt3
WHERE NOT EXISTS ( SELECT 1
FROM table2 AS t2
WHERE t1.xt1 = REPLACE(t2.xt2,',','')
) ;

You need to select IN or exist depending upon the size of inner query. when there is a outer query and inner-sub-query, if the result of sub query is small, In is preferred as outer query is selected based upon result of sub-query.
if the result of sub-query is large, exist is preferred as outer query is evaluated first.

Related

MSSQL - Adding condition on the where and on clause performance

Consider the following two queries:
select *
from
table1 t1
left join
table2 t2
on t1.Id = t2.t1Id and (t1.Status = 1 or t2.Id is not null)
And this one
select *
from
table1 t1
left join
table2 t2
on t1.Id = t2.t1Id
where
t1.Status = 1 or t2.Id is not null
The first one runs in 2 seconds. The second one in 2 minutes. Shouldn't the execution plan be the same?
The query plans are different because the queries (and results) are different.
You're using a LEFT JOIN, so the first query will return rows with NULL values where not in table 2.
The second query will not return those rows.
If it was an INNER JOIN, they would essentially be the same query.
Here the Below Query Returns all the "Table1" results with additional matching Columns based on the "ON Clause" condition.
select * from table1 t1
left join table2 t2
on t1.Id = t2.t1Id and (t1.Status = 1 or t2.Id is not null)
Now, the below query matches the 2 tables and returns the rows based on the ON Clause and an additional WHERE Clause filters the Rows again based on the Condition.
select * from
table1 t1
left join table2 t2 on t1.Id = t2.t1Id
where t1.Status = 1 or t2.Id is not null
Here, Even though we used LEFT JOIN But in this case it acts like an INNER JOIN
So, Here Both the Queries produce Different Result Sets. The Execution Plan Also Vary which results in Different Execution Time.
The best way to deal with an OR is to eliminate it (if possible) or break it into smaller queries. Breaking a short and simple query into a longer, more drawn-out query may not seem elegant, but when dealing with OR problems, it is often the best choice:
select *
from table1 t1
left join table2 t2 t1.Id = t2.t1Id
where t1.Status = 1
union all
select *
from table1 t1
left join table2 t2 t1.Id = t2.t1Id
where t2.Id is not null
You can read more in this article:
https://www.sqlshack.com/query-optimization-techniques-in-sql-server-tips-and-tricks/

How to write an SQL query for this scenerio?

I'm trying to write a query where I want all the items in table1 that are in table2 which meets the first inner join criteria below (this is work find).
Then I want to check table3 to if there are exceptions. Exceptions are base on reference numbers (REF_NO). If the reference number exist in table3 then I need to check if the store number (STORE_NO) matches. If they match then I want the matching record from table1. If not then exclude the matching record from table1.
However, if there reference number DOES NOT exist in table3 then I want the record from table1 that match with table2.
Thanks
USE master
GO
table1
table2
table3
SELECT
T1.TERMINAL,
T1.OPERATOR,
T1.TRANS_NO,
T1.SEQ_NO,
T1.STORE_NO,
T2.REF_NO,
T2.SDATE,
T2.EDATE,
T1.POS_DATE,
T1.ITEM,
T1.ITYPE,
T1.SOLD_QTY,
T1.PRICE,
T2.OI_AMT
FROM [table1] As T1
INNER JOIN [table2] As T2
ON (T1.ITEM = T2.ITEM) And (T1.POS_DATE BETWEEN T2.SDATE And T2.EDATE)
INNER JOIN [table3] As T3
ON (T2.REF_NO = T3.REF_NO) And (T1.STORE_NO = T3.STORE)
SELECT
T1.TERMINAL,
T1.OPERATOR,
T1.TRANS_NO,
T1.SEQ_NO,
T1.STORE_NO,
T2.REF_NO,
T2.SDATE,
T2.EDATE,
T1.POS_DATE,
T1.ITEM,
T1.ITYPE,
T1.SOLD_QTY,
T1.PRICE,
T2.OI_AMT
FROM [table1] As T1
INNER JOIN [table2] As T2
ON T1.ITEM = T2.ITEM
AND T1.POS_DATE > T2.SDATE
AND T1.POS_DATE <= T2.EDATE
WHERE EXISTS ( SELECT TOP (1) 1
FROM table3 as T31
WHERE T2.REF_NO = T31.REF_NO
AND T1.STORE_NO = T31.STORE)
OR NOT EXISTS ( SELECT TOP (1) 1
FROM table3 as T32
WHERE T2.REF_NO = T32.REF_NO)
This should work, it checks both of your conditions. I also would encourage you not to use BETWEEN clause and specify date ranges using two conditions.

replace nested where condition with join

I have a SQL query that looks like this
Select a.*
From table1 a
where a.ColumnName in
(Select MAX(b.ColumnName)
from table2 b
where b.ColumnName2 in
(
Select MAX(c.columnName)
from table3 c
Group by c.ColumnName2
)
Group by b.ColumnName2
)
I am trying to write this in a join statement. I am positive inner join is what I need to get the right information. If someone could translate this to a join statement, I would be really glad.
Thank you.
EDIT 1:
I tried the typical Join statement that a rookie would.
Select a.*
from table1 a
inner join table2 b
on a.columnname = (Select max(b.columnName) from table2)
inner join table3 c
on b.columnName = (select max(c.columnName) from table3)
Obviously, that didn't work because I get 100,000+ results when I should be getting 800. I tried using an alias for table2 and table3 INSIDE the subselect statements and selecting the columnname using THAT alias like this:
Select max(bPart.columnName from table2 bPart)
Select max(cPart.columnName from table3 cPart)
Still the same result.
PERHAPS....
Though I'm not sure why a join is needed. Performance wise exists would likely be fastest, and since you're not returning values from table2 or 3 it seems like it would be the best approach.
SELECT a.*
FROM table1 a
INNER JOIN (SELECT MAX(ColumnName) MColumnName, columnname2
FROM table2
GROUP BY columnName2) B
ON A.columnName = B.mColumnName
INNER JOIN (SELECT MAX(columnName) mColumnName
FROM table3
GROUP BY ColumnName2) C
ON B.columname2 = C.MColumnName

SQL - Apply only one set of joins at a time

I have three joins in my query. My requirement is to select records if either first two joins get satisfied or the third join gets satisfied
select * from security.requestview r
-- either below two joins satisfy
left join security.RequestDelegateView rd on r.id = rd.RequestId
join (select top 1 PersonnelNumber from SECURITY.MyRolesView) m on (m.PersonnelNumber=r.RequestorId or m.PersonnelNumber=r.InitiatorId or m.PersonnelNumber=rd.DelegatePersonnelNumber)
-- or this join
join (select substring(ltrim(DDSUCode),0,3) as Division from security.staffview where PersonnelNumber = (select top 1 PersonnelNumber from SECURITY.MyRolesView)) s
on r.OrganizationUnitRefId like s.Division+'%'
But ofcourse it will try to satisfy all the joins together.
Is there any way I can put some condition where it will select record if either first two joins satisfy or the last join alone satisfies?
Update
I tried putting them as where conditions but then the query is running forever
To answer this question:
Is there any way I can put some condition where it will select record
if either first two joins satisfy or the last join alone satisfies?
No, I know of no way to do this in a single query.
One way you can write this so that the third join only gets executed if the first one doesn't return records is in a multi-query series that populates a temp table or table variable:
INSERT INTO #tmp
SELECT (the first join);
IF (SELECT COUNT(*) FROM #tmp) = 0
INSERT INTO #tmp
SELECT (the second join);
SELECT * FROM #tmp;
WITH FirstJoin AS (
select * from security.requestview r
left join security.RequestDelegateView rd on r.id = rd.RequestId
join (select top 1 PersonnelNumber from SECURITY.MyRolesView) m on (m.PersonnelNumber=r.RequestorId or m.PersonnelNumber=r.InitiatorId or m.PersonnelNumber=rd.DelegatePersonnelNumber)
), SecondJoin AS (
select * from security.requestview r
join (select substring(ltrim(DDSUCode),0,3) as Division from security.staffview where PersonnelNumber = (select top 1 PersonnelNumber from SECURITY.MyRolesView)) s
on r.OrganizationUnitRefId like s.Division+'%'
), BothJoins AS (
SELECT * FROM FirstJoin
UNION ALL
SELECT * FROM SecondJoin
)
SELECT DISTINCT * FROM BothJoins
To get this to work you need to Change the "Select *" inside the CTE's so that every returned column is named. (Since I don't know whats in your tables I couldn't do it).
Please note that FirstJoin and SecondJoin needs to return the same columns.
select *
from table1
left join table2
on table1.id = table2.t1
left join table3
on table1.id = table3.t1
left join table4
on table1.id = table4.t1
where table2.t1 is not null
or table3.t1 is not null
or table4.t1 is not null

Optimize CASE Test in SQL Server

I'm wondering if there's any way to optimize the following SELECT query. (Note: I typed this when writing my question for nonexistent tables and I might not have the correct syntax.)
The goal is, if Table2 contains any related rows I want to set the value of the third column to the number of related rows in Table2. Otherwise, if Table3 contains any related rows I want to set the column to the number of related rows in Table3. Otherwise, I want to set the column value to 0.
SELECT Id, Title,
CASE
WHEN EXISTS (SELECT * FROM Table2 t2 WHERE t2.RelatedId = Table1.Id) THEN
(SELECT COUNT(1) FROM Table2 t2 WHERE t2.RelatedId = Table1.Id)
WHEN EXISTS (SELECT * FROM Table3 t3 WHERE t3.RelatedId = Table1.Id) THEN
(SELECT COUNT(1) FROM Table3 t3 WHERE t3.RelatedId = Table1.Id)
ELSE 0
END AS RelatedCount
FROM Table1
I don't like the fact that I'm basically performing the same query twice (in two cases). Is there any way to do what I want while only performing the query once?
Note that this is part of a much larger query with multiple JOINs and UNIONs so it's not easy to take a completely different approach.
This query should perform much better. You are not just performing the same query twice; since they are correlated subqueries, they will run once per row.
SELECT Id, Title,
coalesce(t2.Count, t3.Count, 0) AS RelatedCount
FROM Table1 t
left outer join (
SELECT RelatedId, count(*) as Count
FROM Table2
group by RelatedId
) t2 on t1.Id = t2.RelatedId
left outer join (
SELECT RelatedId, count(*) as Count
FROM Table3
group by RelatedId
) t3 on t1.Id = t3.RelatedId

Resources