I have a query that looks to match 6 fields between two tables and return the matches. This query uses inner joins. I've been testing INTERSECT to replace this. Thus,
SELECT Field1, Field2, Field3,...,Field6
FROM TableA
INTERSECT
SELECT Field1, Field2, Field3,...,Field6
FROM TableB
However, I want to add TableA.ID and TableB.ID to the results and can't quite get it written where the query won't take forever (relative to the original query). Any thoughts?
UPDATE:
Regretfully, I added the comment about performance when the first concern is how to properly write the query. As you can see, I believe the main issue is the very bad and incorrect query structure that I just haven't been able to improve. My attempt:
SELECT a.ID, b.ID
FROM TableA a
INNER JOIN
(
SELECT Field1, Field2, Field3,...,Field6
FROM TableA
INTERSECT
SELECT Field1, Field2, Field3,...,Field6
FROM TableB
) Dupes ON
(
a.Field1 = Dupes.Field1 and
...
a.Field6 = Dupes.Field6
)
INNER JOIN TableB b
(
b.Field1 = Dupes.Field1 and
...
b.Field6 = Dupes.Field6
)
A simple INNER JOIN on all 6 fields will return all records from both tables having common field values:
SELECT A.ID, B.ID, A.Field1, A.Field2, A.Field3, A.Field4,
A.Field5, A.Field6
FROM TableA AS A
INNER JOIN TableB AS B
ON A.Field1 = B.Field1 AND A.Field2 = B.Field2 AND A.Field3 = B.Field3 AND
A.Field4 = B.Field4 AND A.Field5 = B.Field5 AND A.Field6 = B.Field6
To optimize the above query you need to set indices on all columns used in the ON clause of the INNER JOIN.
Related
I have a SQL query that looks like this
Select a.*
From table1 a
where a.ColumnName in
(Select MAX(b.ColumnName)
from table2 b
where b.ColumnName2 in
(
Select MAX(c.columnName)
from table3 c
Group by c.ColumnName2
)
Group by b.ColumnName2
)
I am trying to write this in a join statement. I am positive inner join is what I need to get the right information. If someone could translate this to a join statement, I would be really glad.
Thank you.
EDIT 1:
I tried the typical Join statement that a rookie would.
Select a.*
from table1 a
inner join table2 b
on a.columnname = (Select max(b.columnName) from table2)
inner join table3 c
on b.columnName = (select max(c.columnName) from table3)
Obviously, that didn't work because I get 100,000+ results when I should be getting 800. I tried using an alias for table2 and table3 INSIDE the subselect statements and selecting the columnname using THAT alias like this:
Select max(bPart.columnName from table2 bPart)
Select max(cPart.columnName from table3 cPart)
Still the same result.
PERHAPS....
Though I'm not sure why a join is needed. Performance wise exists would likely be fastest, and since you're not returning values from table2 or 3 it seems like it would be the best approach.
SELECT a.*
FROM table1 a
INNER JOIN (SELECT MAX(ColumnName) MColumnName, columnname2
FROM table2
GROUP BY columnName2) B
ON A.columnName = B.mColumnName
INNER JOIN (SELECT MAX(columnName) mColumnName
FROM table3
GROUP BY ColumnName2) C
ON B.columname2 = C.MColumnName
I have a SQL Server query for an inner join...
SELECT *
FROM tableA
INNER JOIN tableB on tableA.my_id = tableB.my_id
How would I find all the records that did NOT match in this join?
You can use a FULL JOIN to combine the two tables, then use a WHERE clause to filter the results down to only non-matching rows by checking for a NULL in each tables primary key value.
Full outer join All rows in all joined tables are included, whether they are matched or not.
SELECT a.pk, b.pk
FROM tableA a
FULL JOIN tableB b ON a.pk=b.fk
WHERE
a.pk IS NULL
OR b.pk IS NULL
SELECT A2.* FROM TableA A2
WHERE A2.my_id NOT IN
(Select tableA.my_id FROM
tableA
inner join
tableB
on tableA.my_id = tableB.my_id)
you could similarly do the above starting SELECT B2.* FROM TableB B2, in order to separately query unmatched records in Table B
if you want all records in one table you could UNION ALL the two queries, depending on the table field structures being the same or how you specify the fields you select - what are you doing with the data?
SELECT * FROM tableA where my_id NOT IN (SELECT my_id from tableB)
UNION ALL
SELECT * FROM tableB where my_id NOT IN (SELECT my_id from tableA)
I have 3 Tables:
TableA(IDRem, IDPed, IDOP)
TableB(IDOP, IDPed)
TableC(IDPed, InvoiceDate)
I need a select to JOIN the records of TableA and TableC, but there are two possible conditions:
IF IDPed on TableA IS NOT NULL, then join directly to TableC by IDPed
ID IDPed on TableA IS NULL, then join to TableB by IDOp, and then join TableB to TableC by IDPed
So far i try this:
SELECT
TableA.*
,(CASE WHEN TableC.InvoiceDate IS NULL
THEN TableC2.InvoiceDate
ELSE TableC.InvoiceDate
END) AS InvoiceDate
FROM
TableA
LEFT JOIN TableC on TableA.IDPed = TableC.IDPed
LEFT JOIN TableB on TableB.IDOp = TableA.IDOp
INNER JOIN TableC as TableC2 on TableC2.IDPed = TableB.IDPed
The problem with this is that every field o tableA I want to include in the select I need to do a case...when to determine if the origin is tableA or TableA2.
Is there a better way of doing this? Thanks!
For complex joins you are better served in TSQL using cross apply:
When should I use Cross Apply over Inner Join?
select IDRem, IDPed, IDOP from TableA a
cross apply(
select IDOP, IDPed from TableB binner
where a.IDop = binner.IDop
) b
cross apply(
select IDPed, InvoiceDate cinner
where b.IDPed = cinner.IDPed
) c
where ...
Strictly psuedo-code but should give you a start.
You could try doing both JOINs and use UNION ALL
Edit: As pointed out in the comment by Vladimir Baranov, there's no need to check if a.IdPed IS NULL on the first SELECT since if it is, the JOIN would return no rows.
SELECT
a.*,
c.InvoiceDate
FROM TableA a
INNER JOIN TableC c ON c.IdPed = a.IdPed
UNION ALL
SELECT
a.*,
c.InvoiceDate
FROM TableA a
INNER JOIN TableB b ON b.IdOp = a.IdOP
INNER JOIN TableC c ON c.IdPed = b.IdPed
WHERE a.IdPed IS NULL
I have two query which has successfully inner join
select t1.countResult, t2.sumResult from (
select
count(column) as countResult
from tableA join tableB
on tableA.id = tableB.id
group by name
)t1 inner join (
select
sum(column) as sumResult
from tableA
join tableB
on tableA.id = tableB.id
group by name
)t2
on t1.name= t2.name
The above query will return me the name and the corresponding number of count and the sum. I need to do a comparison between the count and sum. If the count doesnt match the sum, it will return 0, else 1. And so my idea was implementing another outer layer to wrap them up and use CASE WHEN. However, I've failed to apply an outer layer just to wrap them up? This is what I've tried:
select * from(
select t1.countResult, t2.sumResult from (
select
count(column) as countResult
from tableA join tableB
on tableA.id = tableB.id
group by name
)t1 inner join (
select
sum(column) as sumResult
from tableA
join tableB
on tableA.id = tableB.id
group by name
)t2
on t1.name= t2.name
)
Alright the problem can be solved by simply assigning a name to the outer layer.
select * from(
select t1.countResult, t2.sumResult from (
select
count(column) as countResult
from tableA join tableB
on tableA.id = tableB.id
group by name
)t1 inner join (
select
sum(column) as sumResult
from tableA
join tableB
on tableA.id = tableB.id
group by name
)t2
on t1.name= t2.name
) as whatever //SQL Server need a name to wrap
Hope it will help any newbie like me
Ok, so far you have selected everything your first select has generated (kinda useless, but a start for what you want ;) )
SELECT CASE
WHEN countresult=sumresult THEN 'Equal'
ELSE 'Not'
END
FROM ( --your join select --
)
I don't have any sample data to test this so can just go on your code.
Your queries for t1 & t2 look identical - why don't you just do a sum & count in 1 step?
SELECT COUNT(column) AS countResult
,SUM(column) AS sumResult
FROM tableA INNER JOIN tableB
ON tableA.id = tableB.id
GROUP BY name
Also, as you mention you are a newb - read up on Common Table Expressions in SQL Server.
Before SQL 2005 you had to write these convoluted queries within queries within...
Get into the habit of using CTEs now.
I need a query that returns rows from table A if value X in table A is not the same as the sum of value Y from corresponding row(s) in table B. The issue is that there may or may not be rows in table B that correspond to the rows in table A, but if there no rows in table B, then the rows from table A should still be returned (because there is not a matching value in table B.) So it is like a LEFT OUTER join scenario, but the extra complication of having a comparison as an additional selection criteria.
I have a query that does the opposite, ie. returns rows if the value in table A is the same as the value of row(s) in table B, but sadly this isn't what I need!
SELECT TableA.id, TableA.bdate
FROM TableA
LEFT JOIN TableB ON TableB.ID = TableA.id
WHERE TableA.select_field = 408214
AND TableA.planned_volume =
(select sum(actual_volume)
from
TableB
where TableB.id = TableA.id)
ORDER BY TableA.id
Any help greatly appreciated.
How about something like this:
SELECT TableA.Id, TableA.bdate
FROM TableA
LEFT JOIN
(
SELECT Id, SUM(actual_volume) AS b_volume
FROM TableB
GROUP BY Id
) AS TableBGrouping
ON TableBGrouping.Id= TableA.Id AND TableA.planned_volume <> b_volume
ORDER BY TableA.Id
WITH TotalVolumes
AS
(
SELECT id, SUM(actual_volume) AS total_volume
FROM TableB
GROUP
BY id
)
SELECT id, bdate, planned_volume
FROM TableA
EXCEPT
SELECT A.id, A.bdate, T.total_volume
FROM TableA AS A
JOIN TotalVolumes AS T
ON A.id = T.id;
SELECT TableA.id, TableA.bdate
FROM TableA
LEFT JOIN TableB ON TableB.ID = TableA.id
AND TableA.planned_volume <>
(select sum(actual_volume)
from
TableB
where TableB.id = TableA.id)
ORDER BY TableA.id