SQL - Using a join on a column with multiple values in cells - sql-server

I have two tables I am trying to join. One table has a column with IDs in it, and I am trying to do a left join to a different table that has the same IDs in it, although the second table could contain more than one ID per cell. For example, if my first table has an ID value of 123, and the second table has an ID value of 123;724;823, is there any way to get it to join the two rows?

Have you tried the query designer? It makes joins easy to build.
SELECT column_names
FROM table_name1 LEFT JOIN table_name2
ON table_name1.id_column = table_name2.id_column
WHERE <conditions>
Hope this helps.

select *
from
(
select '123' as id
union select '124'
) as t1
left join
(
select '123;001;002' as id
union select '001;123;002'
union select '001;002;123'
) as t2 on
t2.id = t1.id
or t2.id like t1.id + ';%'
or t2.id like '%;' + t1.id + ';%'
or t2.id like '%;' + t1.id

Using multiple LIKE operators is probably the fastest way, but if you have a string-splitter function such as DelimitedSplit8K, you can split the values out into a table and join to it.
SELECT *
FROM table1 t1
LEFT JOIN (
SELECT *
FROM table2 t2
OUTER APPLY (
SELECT *
FROM dbo.[DelimitedSplit8K] (t2.id,';') -- splits the values in multi id column
) t
) t ON t.Item = t1.id -- t.Item is the value generated from the DelimitedSplit8K TVF
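
On SQL Server 2016 or later, the built-in STRING_SPLIT function could stand in for a custom splitter. The following is only a sketch under that assumption: the table variables and sample values are made up for illustration, and STRING_SPLIT needs database compatibility level 130 or higher.

```sql
-- Assumption: SQL Server 2016+ with compatibility level >= 130 (STRING_SPLIT available).
-- @table1 and @table2 are hypothetical stand-ins for the two tables in the question.
DECLARE @table1 TABLE (id VARCHAR(10));
DECLARE @table2 TABLE (id VARCHAR(100));

INSERT INTO @table1 VALUES ('123'), ('124');
INSERT INTO @table2 VALUES ('123;724;823'), ('001;002');

SELECT t1.id AS t1_id, t2.id AS t2_ids
FROM @table1 AS t1
LEFT JOIN (
    -- One output row per individual ID stored in the multi-value column.
    SELECT src.id, s.value AS item
    FROM @table2 AS src
    CROSS APPLY STRING_SPLIT(src.id, ';') AS s
) AS t2 ON t2.item = t1.id;
```

With this sample data, 123 should pair with '123;724;823', and 124 should come back with NULL in t2_ids because nothing matches it.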

Related

MSSQL SP: Select all rows from one table using IDs from another

Perhaps quite a simple question that seems to have a rather complicated answer, one I have not been able to dig out.
I'm using SQL Server 2012.
I have these two statements. The first selects all my data based on a parameter; the second selects up to five rows of data (which means no joins) from another table, based on the IDs returned by the first select.
SELECT * FROM TBL1 WHERE XXX
SELECT * FROM TBL2
WHERE TBL1_ID IN (SELECT ID FROM TBL1 WHERE XXX)
It seems very redundant to me that I basically have to repeat my TBL1 select inside my TBL2 select. Instead, I would like to know if I can select from TBL2 using the IDs from the data I already got from TBL1.
I am fully aware that this will most likely result in two result sets that don't necessarily correlate, but I can generally use PHP array manipulation to fix this, so it's not that big of a deal.
You can also use EXISTS:
SELECT * FROM TBL2
WHERE EXISTS (SELECT 1 FROM TBL1 WHERE TBL1.ID = TBL2.TBL1_ID AND XXX)
Using IN:
DECLARE @T1 TABLE (ID INT, Value VARCHAR(50));
DECLARE @T2 TABLE (ID INT);
INSERT INTO @T1 VALUES (1, 'First'), (2, 'Second');
INSERT INTO @T2 VALUES (1), (3);
SELECT * FROM @T1 WHERE ID IN (SELECT ID FROM @T2);
Result:
ID          Value
----------- --------------------------------------------------
1           First
Using INNER JOIN:
SELECT T1.ID, T1.Value FROM @T1 T1 INNER JOIN @T2 T2 ON T1.ID = T2.ID;
Result:
ID          Value
----------- --------------------------------------------------
1           First
Using LEFT JOIN:
SELECT T1.ID, T1.Value FROM @T1 T1 LEFT JOIN @T2 T2 ON T1.ID = T2.ID
WHERE T2.ID IS NOT NULL;
Result:
ID          Value
----------- --------------------------------------------------
1           First
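
If the goal is to avoid writing the TBL1 filter twice, one option is to evaluate it once, keep the matching IDs in a table variable, and drive both selects from that. This is only a sketch: the table variables, the Category column, and the sample rows are hypothetical, and the Category filter merely stands in for the elided XXX condition.

```sql
-- Hypothetical stand-ins for TBL1 and TBL2 from the question.
DECLARE @TBL1 TABLE (ID INT PRIMARY KEY, Category VARCHAR(20));
DECLARE @TBL2 TABLE (TBL1_ID INT, Detail VARCHAR(50));
DECLARE @Matched TABLE (ID INT PRIMARY KEY);

INSERT INTO @TBL1 VALUES (1, 'A'), (2, 'B'), (3, 'A');
INSERT INTO @TBL2 VALUES (1, 'first detail'), (2, 'second detail'), (3, 'third detail');

-- Evaluate the filter once and remember the matching IDs.
INSERT INTO @Matched (ID)
SELECT ID FROM @TBL1 WHERE Category = 'A';

-- First result set: the TBL1 rows that passed the filter.
SELECT t1.* FROM @TBL1 AS t1 INNER JOIN @Matched AS m ON m.ID = t1.ID;

-- Second result set: the TBL2 rows belonging to those same IDs.
SELECT t2.* FROM @TBL2 AS t2 INNER JOIN @Matched AS m ON m.ID = t2.TBL1_ID;
```

Whether this beats simply repeating the subquery depends on the data and the plan the optimizer picks, so it is worth measuring rather than assuming.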

How to INNER JOIN multiple values

SELECT t1.*, t2.name as song_name
FROM table1 as t1
INNER JOIN table2 as t2
ON t1.song_name_id = t2.id
WHERE t1.id = '..'
I get the following error when the t2.id column holds two IDs, like 6,12. It works when t2.id (nvarchar) holds only one ID. How can I make it work with two or more?
Error:
Conversion failed when converting the nvarchar value '6,12' to data type int.
The error is possibly because table1.song_name_id and table2.id have different data types. Please ensure they are of the same type or use the CONVERT function.
Don't store CSV data in your table. However, a temporary solution is given below.
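SELECT t1.*, t2.name as song_name
FROM table1 as t1
INNER JOIN table2 as t2
-- value before the first comma (or the whole value when there is no comma)
ON (t1.song_name_id = left(t2.id, charindex(',', t2.id + ',') - 1))
-- value after the first comma (or the whole value when there is no comma)
or (t1.song_name_id = substring(t2.id, charindex(',', t2.id) + 1, len(t2.id)))
WHERE t1.id = '..'
This only handles up to two values. A different approach is needed if the table2 id column contains more than two values separated by commas.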
As Bishakh Ghosh mentioned, you really should be joining on fields of the same type, and as Coder1991 said, you really shouldn't store comma-separated values in a single field.
Given that the field is in the state that it is, you can use pattern matching to find whether or not t2.id contains the value in t1.song_name_id.
In the below example, the additional commas will allow for "6,12" to become ",6,12," and song_name_id in the like statement will become "%,6,%" which will then find a match.
SELECT
t1.*,
t2.name as song_name
FROM
table1 as t1
INNER JOIN table2 as t2 ON
',' + t2.id + ',' LIKE '%,' + CONVERT(VARCHAR(20),t1.song_name_id) + ',%'
WHERE
t1.id = '..'
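
The advice against storing comma-separated IDs usually means moving them into a separate link table with one ID per row. A rough sketch of that idea follows; every table name, column name, and sample value here is hypothetical, since the real schema is not shown in the question.

```sql
-- Hypothetical normalized layout: @table2_ids holds one row per ID
-- instead of a single nvarchar value such as '6,12'.
DECLARE @table1 TABLE (id INT PRIMARY KEY, song_name_id INT);
DECLARE @table2 TABLE (row_id INT PRIMARY KEY, name NVARCHAR(100));
DECLARE @table2_ids TABLE (row_id INT, song_name_id INT, PRIMARY KEY (row_id, song_name_id));

INSERT INTO @table1 VALUES (1, 6), (2, 12), (3, 99);
INSERT INTO @table2 VALUES (10, N'Shared name');
INSERT INTO @table2_ids VALUES (10, 6), (10, 12);  -- replaces the value '6,12'

-- Plain integer equality joins: no LIKE and no type conversion needed.
SELECT t1.*, t2.name AS song_name
FROM @table1 AS t1
INNER JOIN @table2_ids AS link ON link.song_name_id = t1.song_name_id
INNER JOIN @table2 AS t2 ON t2.row_id = link.row_id;
```

With this sample data, the rows with song_name_id 6 and 12 should both pick up the name, while the row with 99 is left out of the inner join.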

Using inner join to reduce results. Do I need to reference new table anywhere beyond the join statement?

I'm wondering if I can use an inner join by itself as a kind of where clause, or whether it's redundant to also reference a field from the joined table in my where clause.
select * from T1 inner join T2 on T1.id = T2.id where T2.z is not null
Is the "T2.z is not null" part redundant if all I want returned are records in T1 where the same id exists in T2?
For one thing, select * from t1 inner join t2 [...] will not return records in t1 - it will return all the columns of t1 and t2. You could fix that by selecting specifically the columns in t1 - don't select *.
Then, if there are many rows in t2 with the same t2.id, matching a given t1.id, you will get a whole bunch of rows in the result for that one row in the input t1. So you will not always "reduce" the result set.
It seems what you want can be achieved with the in operator, something like
select * from t1 where t1.id in (select id from t2);
This is equivalent to the following modification of your query. You do not need a where clause for this to work:
select t1.* from t1 inner join (select distinct id from t2) b on t1.id = b.id;
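
To see the row multiplication described above, here is a small sketch with made-up table variables and sample rows, in which two rows in T2 share the same id.

```sql
-- Hypothetical sample data: T2 has two rows for id 1 and none for id 2.
DECLARE @T1 TABLE (id INT PRIMARY KEY, label VARCHAR(10));
DECLARE @T2 TABLE (id INT, z VARCHAR(10));

INSERT INTO @T1 VALUES (1, 'one'), (2, 'two');
INSERT INTO @T2 VALUES (1, 'a'), (1, 'b');

-- Plain inner join: the T1 row with id 1 is returned once per matching T2 row.
SELECT t1.* FROM @T1 AS t1 INNER JOIN @T2 AS t2 ON t1.id = t2.id;

-- IN: the T1 row with id 1 is returned once, however many T2 rows match.
SELECT t1.* FROM @T1 AS t1 WHERE t1.id IN (SELECT t2.id FROM @T2 AS t2);
```

The first select should return the id 1 row twice, the second only once.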
In the following query,
select t1.* from T1 inner join T2 on T1.id = T2.id where T2.z is NOT null
The WHERE condition is redundant, assuming that T2.Z is a NOT NULL column.
That would leave you with this:
select t1.* from T1 inner join T2 on T1.id = T2.id
, which is a little odd because, in a normally designed database, either T1.id or T2.id would be the primary key of its table.
If T1.id is the primary key of T1, then your query is going to return duplicates -- each T1 row will be repeated once for each child that exists in T2.
If T2.id is the primary key of T2, then you should not need to join to T2 at all, because every possible T1.id value must exist in T2.id, because of the FOREIGN KEY relationship that (should) exist. In that case, you could have written:
select t1.* from T1 WHERE T1.id is not null;
So, the answer to your question is that you do not need to reference the tables outside of the join condition in order for the join to be applied. But something seems a little off about the approach.

T-SQL Join on columns OR fixed value

I'm trying to figure out some basic rules in T-SQL.
What I'm trying to achieve here is to get only the records from Table1 that have a match in Table2 - AND - all records from Table1 where the 'Valid' column has a value of 1 (=true).
Previously I've done this with two selects and a UNION like this:
SELECT T1.*
FROM Table1 T1
INNER JOIN Table2 T2 ON T1.ID = T2.ID
UNION
SELECT T1.*
FROM Table1 T1
WHERE T1.Valid = 1
But isn't there any other way than using multiple selects and UNION to achieve this?
While fiddling, I did the following code bit, which however only works if there's exactly one match in Table2 (otherwise it'll multiply the records by the number of matches in T2).
SELECT T1.*
FROM Table1 T1
INNER JOIN Table2 T2 ON T1.ID = T2.ID
OR T1.Valid = 1
What would be the best way to achieve my goal in terms of performance?
Also please don't hold back on the comments, possible flaws, or explanations of how and why another solution might be better.
Assuming that T1.ID and T2.ID are unique or primary keys, the queries below should work.
If there are duplicates, you may have to write SELECT DISTINCT T1.*; the UNION operator in the original returns only distinct rows.
This one should do:
SELECT T1.*
FROM Table1 T1
WHERE T1.ID IN ( SELECT T2.ID FROM Table2 T2 WHERE T2.ID IS NOT NULL)
OR T1.Valid = 1
or
SELECT T1.*
FROM Table1 T1
LEFT JOIN Table2 T2 ON T1.ID = T2.ID
WHERE T2.ID IS NOT NULL OR T1.Valid = 1
But I think the execution plan will be the same in the end.
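
Another variant worth trying is EXISTS, which tests for a match without joining, so duplicate IDs in Table2 cannot multiply the Table1 rows. A minimal sketch, using hypothetical table variables and sample rows in place of the real tables:

```sql
-- Hypothetical sample data: Table2 contains the matching ID 1 twice.
DECLARE @Table1 TABLE (ID INT PRIMARY KEY, Valid BIT);
DECLARE @Table2 TABLE (ID INT);

INSERT INTO @Table1 VALUES (1, 0), (2, 1), (3, 0);
INSERT INTO @Table2 VALUES (1), (1);

-- Each Table1 row is returned at most once: either it is valid,
-- or at least one Table2 row shares its ID.
SELECT T1.*
FROM @Table1 AS T1
WHERE T1.Valid = 1
   OR EXISTS (SELECT 1 FROM @Table2 AS T2 WHERE T2.ID = T1.ID);
```

With this data the query should return IDs 1 and 2 once each. Unlike UNION, it does not remove duplicates that already exist in Table1 itself, so the two are only interchangeable when Table1 rows are unique.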

Optimize CASE Test in SQL Server

I'm wondering if there's any way to optimize the following SELECT query. (Note: I typed this when writing my question for nonexistent tables and I might not have the correct syntax.)
The goal is, if Table2 contains any related rows I want to set the value of the third column to the number of related rows in Table2. Otherwise, if Table3 contains any related rows I want to set the column to the number of related rows in Table3. Otherwise, I want to set the column value to 0.
SELECT Id, Title,
CASE
WHEN EXISTS (SELECT * FROM Table2 t2 WHERE t2.RelatedId = Table1.Id) THEN
(SELECT COUNT(1) FROM Table2 t2 WHERE t2.RelatedId = Table1.Id)
WHEN EXISTS (SELECT * FROM Table3 t3 WHERE t3.RelatedId = Table1.Id) THEN
(SELECT COUNT(1) FROM Table3 t3 WHERE t3.RelatedId = Table1.Id)
ELSE 0
END AS RelatedCount
FROM Table1
I don't like the fact that I'm basically performing the same query twice (in two cases). Is there any way to do what I want while only performing the query once?
Note that this is part of a much larger query with multiple JOINs and UNIONs so it's not easy to take a completely different approach.
This query should perform much better. You are not just performing the same query twice; since they are correlated subqueries, they will run once per row.
SELECT t1.Id, t1.Title,
coalesce(t2.Count, t3.Count, 0) AS RelatedCount
FROM Table1 t1
left outer join (
SELECT RelatedId, count(*) as Count
FROM Table2
group by RelatedId
) t2 on t1.Id = t2.RelatedId
left outer join (
SELECT RelatedId, count(*) as Count
FROM Table3
group by RelatedId
) t3 on t1.Id = t3.RelatedId
