Odd LEFT JOIN error - sql-server

I have two tables I am trying to join together on a particular column (shared between the two). The data in these columns are supposed to represent numbers, but the data are actually characters, and some of the values are non-numeric (e.g. '2,3,4', or 'n/a'). I am ignoring the rows with non-numeric values of this column. I am treating the columns as numeric in the join because '001' must match '1', '01', '0001', etc. Inner joining them works, left outer joining them doesn't:
SELECT *
FROM Table1 T1
INNER JOIN Table2 T2
ON T1.ID NOT LIKE '%[^ 0-9]%'
AND T2.ID NOT LIKE '%[^ 0-9]%'
AND T1.ID + 0 = T2.ID + 0
-- Success!
SELECT *
FROM Table1 T1
LEFT JOIN Table2 T2
ON T1.ID NOT LIKE '%[^ 0-9]%'
AND T2.ID NOT LIKE '%[^ 0-9]%'
AND T1.ID + 0 = T2.ID + 0
-- Conversion failed when converting the nvarchar value '2,3,4' to data type int.
Why am I getting an error on the outer join but not the inner join?
P.S.: Fixing the data is not an option. It is not my data; I cannot touch it. I have to find a way to work with it.
EDIT: I am running SQL Server 2008 R2 RTM

SQL Server does not guarantee the order in which the conditions will be evaluated. In your case, T1.ID + 0 = T2.ID + 0 is being evaluated before the NOT LIKE conditions.
Please try the following (SQL Server 2012 and above):
SELECT *
FROM Table1 T1
LEFT JOIN Table2 T2
ON TRY_CAST(T1.ID AS int) = TRY_CAST(T2.ID AS int)
For SQL Server 2008:
SELECT *
FROM (SELECT * FROM Table1 WHERE ID NOT LIKE '%[^ 0-9]%') T1
LEFT JOIN (SELECT * FROM Table2 WHERE ID NOT LIKE '%[^ 0-9]%') T2
ON CAST(T1.ID AS INT) = CAST(T2.ID AS INT)
Reference
TRY_CAST (T-SQL)
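On SQL Server 2008, where TRY_CAST is unavailable, another option is to guard each conversion with a CASE expression, so the CAST is only reached for values that pass the pattern check. The following is a minimal sketch, not a tested solution: it reuses Table1, Table2 and ID from the question and assumes every value that passes the NOT LIKE test really converts to int (a value with a space between digits, or one too large for int, would still fail). Unlike the derived-table version, it keeps the non-numeric rows of Table1 in the result, with NULLs in the Table2 columns.

```sql
-- Sketch only: CASE yields NULL for non-numeric values, and NULL never equals anything,
-- so those rows simply find no match instead of raising a conversion error.
SELECT *
FROM Table1 T1
LEFT JOIN Table2 T2
    ON  CASE WHEN T1.ID NOT LIKE '%[^ 0-9]%' THEN CAST(T1.ID AS int) END
      = CASE WHEN T2.ID NOT LIKE '%[^ 0-9]%' THEN CAST(T2.ID AS int) END;
```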

What if you try this instead? Does it do what you want? (Note that TRY_PARSE requires SQL Server 2012 or later.)
SELECT *
FROM Table1 T1
LEFT JOIN Table2 T2
ON (isnumeric(T1.ID) = 1
AND isnumeric(T2.ID) = 1)
AND try_parse(T1.ID as int) + 0 = try_parse(T2.ID as int) + 0

Related

MSSQL - Adding condition on the where and on clause performance

Consider the following two queries:
select *
from
table1 t1
left join
table2 t2
on t1.Id = t2.t1Id and (t1.Status = 1 or t2.Id is not null)
And this one
select *
from
table1 t1
left join
table2 t2
on t1.Id = t2.t1Id
where
t1.Status = 1 or t2.Id is not null
The first one runs in 2 seconds. The second one in 2 minutes. Shouldn't the execution plan be the same?
The query plans are different because the queries (and their results) are different.
You're using a LEFT JOIN, so the first query returns every row of table1, with NULL values in the table2 columns where there is no match.
The second query does not return those unmatched rows unless t1.Status = 1, because the WHERE clause is applied after the join.
If it were an INNER JOIN, they would essentially be the same query.
The query below returns all rows of table1, plus the matching table2 columns wherever the ON clause condition is satisfied.
select * from table1 t1
left join table2 t2
on t1.Id = t2.t1Id and (t1.Status = 1 or t2.Id is not null)
The query below joins the two tables on the ON clause, and then the additional WHERE clause filters the joined rows again.
select * from
table1 t1
left join table2 t2 on t1.Id = t2.t1Id
where t1.Status = 1 or t2.Id is not null
Here, even though we used a LEFT JOIN, it behaves much like an INNER JOIN: unmatched table1 rows survive only when t1.Status = 1.
So the two queries produce different result sets. The execution plans also differ, which results in different execution times.
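The difference between the two placements can be seen on a tiny data set. This is an illustrative sketch only: the table variables and sample rows are made up, and only the column names come from the question.

```sql
-- Sketch only: table variables stand in for table1 and table2.
DECLARE @table1 TABLE (Id int PRIMARY KEY, Status int NOT NULL);
DECLARE @table2 TABLE (Id int PRIMARY KEY, t1Id int NOT NULL);

INSERT INTO @table1 (Id, Status) VALUES (1, 1), (2, 0), (3, 0);
INSERT INTO @table2 (Id, t1Id) VALUES (10, 3);

-- Condition in ON: all three table1 rows come back; only Id 3 has table2 values.
SELECT *
FROM @table1 t1
LEFT JOIN @table2 t2
    ON t1.Id = t2.t1Id AND (t1.Status = 1 OR t2.Id IS NOT NULL);

-- Condition in WHERE: Id 2 is dropped (no match and Status <> 1), leaving Ids 1 and 3.
SELECT *
FROM @table1 t1
LEFT JOIN @table2 t2
    ON t1.Id = t2.t1Id
WHERE t1.Status = 1 OR t2.Id IS NOT NULL;
```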
The best way to deal with an OR is to eliminate it (if possible) or break it into smaller queries. Breaking a short and simple query into a longer, more drawn-out query may not seem elegant, but when dealing with OR problems, it is often the best choice:
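select *
from table1 t1
left join table2 t2 on t1.Id = t2.t1Id
where t1.Status = 1
union all
select *
from table1 t1
left join table2 t2 on t1.Id = t2.t1Id
-- skip rows already returned by the first branch, so UNION ALL adds no duplicates
where t2.Id is not null and (t1.Status <> 1 or t1.Status is null)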
You can read more in this article:
https://www.sqlshack.com/query-optimization-techniques-in-sql-server-tips-and-tricks/

Why do these queries return different results?

Both of these SQL Server queries should return the same count, but they return different results: 8219 and 7876.
A left join should return all rows from the left table (8219).
What could be the reason for the other result (7876)?
select count(*)
from t1 left join t2 on t1.id=t2.id
where t2.[date]='20191001'
-- returns 7876
select count(*)
from t1 left join t2 on t1.id=t2.id
-- returns 8219
select count(*)
from t2
where [date]<>'20191001' or [date] is null
-- returns 0
Your first query has a LEFT JOIN, but the WHERE clause is turning it into an inner join, because it filters out all NULL values.
The correct solution is to move the condition to the ON clause:
select count(*)
from t1 left join
t2
on t1.id = t2.id and t2.[date] = '20191001';
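To see how many rows the WHERE clause removed, you can count the t1 rows that have no partner in t2. This is a sketch that reuses the t1, t2 and id names from the question. Given the counts quoted there, and given that the third query found no t2 rows with a different or NULL date, it would be expected to return 8219 - 7876 = 343.

```sql
-- Sketch only: counts the t1 rows for which the left join found no t2 row.
SELECT COUNT(*) AS unmatched_rows
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t2.id IS NULL;
```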

Invalid t1.id in select list because it is not contained in either an aggregate function or group by clause

I have two tables, t1 and t2.
t1 has data like this:
Id  Name
1   Ab
2   Dc
3   Cd
t2 has data like this:
Id  Revenue
1   100
2   0
3   200
And my SQL query is:
select t1.id ,t1.name,sum(t2.rev)
from t1
inner join t2 on t1.id= t2.id
where t1.id=100 and t2.Revenue <> 0
group by t1.id
Just add the columns that you want to select to the GROUP BY clause:
select t1.id ,t1.name,sum(t2.rev)
from t1
inner join t2 on t1.id= t2.id
where t1.id=100 and t2.Revenue <> 0
group by t1.id, t1.name, t2.rev
The error means that SUM returns a single value per group, but a non-aggregated column can hold many different values within that group, and SQL Server does not know which of them should be returned.
UPDATE:
If one of your columns has a type such as text, ntext or image, then you should cast it to NVARCHAR:
select t1.id ,t1.name,sum(t2.rev)
from t1
inner join t2 on t1.id= t2.id
where t1.id=100 and t2.Revenue <> 0
group by t1.id, CAST( t1.name AS NVARCHAR(100)), CAST( t2.rev AS NVARCHAR(100))
UPDATE 1:
TEXT, NTEXT and IMAGE are old data types and are deprecated. They should be replaced by, or cast to, the corresponding types VARCHAR(MAX), NVARCHAR(MAX) and VARBINARY(MAX).
If only one column is of type TEXT, then cast just that column:
select t1.id ,t1.name,sum(t2.rev)
from t1
inner join t2 on t1.id= t2.id
where t1.id=100 and t2.Revenue <> 0
group by t1.id, CAST( t1.name AS NVARCHAR(100)), t2.rev
If anything, the column causing the error would be t1.name. Since the name is completely dependent on the ID, you can add it to the GROUP BY clause without harming the query's correctness:
select t1.id ,t1.name,sum(t2.rev)
from t1
inner join t2 on t1.id= t2.id
where t1.id=100 and t2.Revenue <> 0
group by t1.id, t1.name
If you want to display a column and you use GROUP BY, then you must also group by that column:
select t1.id ,t1.name,sum(t2.Revenue) from t1
inner join t2 on t1.id= t2.id where t1.id=1 and t2.Revenue <> 0 group by t1.id,t1.Name
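An alternative to listing name in GROUP BY is to wrap it in an aggregate such as MAX, which is harmless when each id has exactly one name. The sketch below is illustrative only: the table variables, the varchar(10) type and the total_revenue alias are assumptions, while the sample rows are the ones shown in the question.

```sql
-- Sketch only: table variables stand in for t1 and t2 from the question.
DECLARE @t1 TABLE (id int PRIMARY KEY, name varchar(10));
DECLARE @t2 TABLE (id int, Revenue int);

INSERT INTO @t1 (id, name) VALUES (1, 'Ab'), (2, 'Dc'), (3, 'Cd');
INSERT INTO @t2 (id, Revenue) VALUES (1, 100), (2, 0), (3, 200);

-- name is wrapped in MAX(), so only id needs to appear in GROUP BY.
SELECT t1.id, MAX(t1.name) AS name, SUM(t2.Revenue) AS total_revenue
FROM @t1 t1
INNER JOIN @t2 t2 ON t1.id = t2.id
WHERE t2.Revenue <> 0
GROUP BY t1.id;
```

With this sample data the query returns ids 1 and 3; id 2 is filtered out by the Revenue <> 0 condition.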

insert records except in sql server

I have two tables t1 and t2.
t1 has 10k records and t2 has 2k records. The 2k records of t2 are also present in t1.
I want the 8k records from t1 that are not present in t2.
I'm doing this as below:
select id, second_telphon from t1
except
select id, second_telphon from t2
However, I'm still getting all 10k records. Is the "except" keyword not working?
How can I achieve this?
You can use NOT EXISTS or an outer join to get the rows that exist in only one of the tables.
With tables t1 and t2, either a left or a right join will work.
Example:
SELECT T1.*
FROM T1
WHERE NOT EXISTS(SELECT NULL
FROM T2
WHERE T1.ID = T2.ID
AND T1.Date = T2.Date
AND T1.Hour = T2.Hour)
Or:
SELECT T1.*
FROM T1
LEFT JOIN T2
ON T1.ID = T2.ID
AND T1.Date = T2.Date
AND T1.Hour = T2.Hour
WHERE T2.ID IS NULL
Try this:
SELECT *
FROM T1
WHERE NOT EXISTS(SELECT 1 FROM t2 WHERE t2.id = T1.id AND t2.second_telphon = T1.second_telphon)
If ID is a unique value (and is never NULL in t2, since a NULL in the subquery makes NOT IN return no rows), you can also try this:
SELECT *
FROM T1
WHERE ID NOT IN(SELECT ID FROM t2)
You could try a union, followed by an aggregation to restrict to those records in the first table which were not duplicated by the second table:
SELECT id, second_telphon
FROM
(
SELECT id, second_telphon FROM t1
UNION ALL
SELECT id, second_telphon FROM t2
) t
GROUP BY id, second_telphon
HAVING COUNT(*) = 1;
If a record, being defined as an id, second_telphon pair, has a record count of only one after the union, it implies that this record was unique to the first table.
Just do a left join:
select t1.id,t1.second_telphon from t1
left join t2 on
t1.id = t2.id
and t1.second_telphon =t2.second_telphon
where t2.id is null
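EXCEPT compares whole rows, so a t1 row is removed only when both id and second_telphon match a row in t2. One possible reason for getting all 10k rows back is that the second_telphon values differ between the tables even where the ids match. That is only a guess about the data; the sketch below, which uses the table and column names from the question plus two made-up aliases, is one way to check it.

```sql
-- Sketch only: lists ids present in both tables whose second_telphon values differ.
SELECT t1.id,
       t1.second_telphon AS t1_phone,
       t2.second_telphon AS t2_phone
FROM t1
INNER JOIN t2 ON t1.id = t2.id
WHERE t1.second_telphon <> t2.second_telphon
   OR (t1.second_telphon IS NULL AND t2.second_telphon IS NOT NULL)
   OR (t1.second_telphon IS NOT NULL AND t2.second_telphon IS NULL);
```

If this returns rows, comparing on id alone (as the NOT IN and left join answers do) may be closer to what is wanted.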

T-SQL Join on columns OR fixed value

I'm trying to figure out some basic rules in T-SQL.
What I'm trying to achieve here is to get only the records from Table1 that have a match in Table2, plus all records from Table1 where the 'Valid' column has a value of 1 (= true).
Previously I've done this with two selects and a UNION like this:
SELECT T1.*
FROM Table1 T1
INNER JOIN Table2 T2 ON T1.ID = T2.ID
UNION
SELECT T1.*
FROM Table1 T1
WHERE T1.Valid = 1
But isn't there any other way than using multiple selects and UNION to achieve this?
While fiddling, I did the following code bit, which however only works if there's exactly one match in Table2 (otherwise it'll multiply the records by the number of matches in T2).
SELECT T1.*
FROM Table1 T1
INNER JOIN Table2 T2 ON T1.ID = T2.ID
OR T1.Valid = 1
What would be the best way to achieve my goal in terms of performance?
Also please don't hold back on the comments, possible flaws, or explanations of how and why another solution might be better.
Assuming that T1.ID and T2.ID are unique or primary keys, the queries below should do.
If there are duplicates, you may have to write SELECT DISTINCT T1.*, because the UNION operator in the original query returns only distinct rows.
This one should do:
SELECT T1.*
FROM Table1 T1
WHERE T1.ID IN ( SELECT T2.ID FROM Table2 T2 WHERE T2.ID IS NOT NULL)
OR T1.Valid = 1
or
SELECT T1.*
FROM Table1 T1
LEFT JOIN Table2 T2 ON T1.ID = T2.ID
WHERE T2.ID IS NOT NULL OR T1.Valid = 1
But I think the execution plan will be the same in the end.
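Another variant worth comparing is EXISTS, which returns each Table1 row at most once no matter how many matches Table2 contains. This is a sketch using the table and column names from the question; unlike UNION, it does not collapse rows that are already duplicated within Table1 itself, and its performance relative to the other queries would need to be checked against the actual execution plans.

```sql
-- Sketch only: keep a Table1 row if it is valid or has at least one match in Table2.
SELECT T1.*
FROM Table1 T1
WHERE T1.Valid = 1
   OR EXISTS (SELECT 1 FROM Table2 T2 WHERE T2.ID = T1.ID);
```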
