How to join two Tables which have no unique ID

How to join two Tables which have no unique ID - sql-server

I am trying to Left join Table 1 to table 2
Table1 Table2
ID Data ID Data2
1 r 1 q
2 t 1 a
3 z 2 x
1 u 3 c
After i have left joined this two Tables i would like get something like this
Table1+2
ID Data Data2
1 r a
2 t x
3 z c
1 u q
and NOT
Table1+2
ID Data Data2
1 r q
2 t x
3 z c
1 u q
and my question is: is there any possibility to tell table 2 that if u have used something for table 1, dont use it and give me next value. do i have to make it im T-SQL or to and new column where i can list if this id exists then write 2 if not 1(Number data). How can i solve this problem?
Ty u in advance.

Since there's no uniqueness in the ID on both tables, lets add some uniqueness to it.
So it can be used to join on.
The window function ROW_NUMBER can be used for that.
An example solution that gives the expected result:
DECLARE #TestTable1 TABLE (ID INT, Data VARCHAR(1));
DECLARE #TestTable2 TABLE (ID INT, Data VARCHAR(1));
INSERT INTO #TestTable1 VALUES (1,'r'),(2,'t'),(3,'z'),(1,'u');
INSERT INTO #TestTable2 VALUES (1,'q'),(1,'a'),(2,'x'),(3,'c');
select
t1.ID, t1.Data,
t2.Data as Data2
from (
select ID, Data,
row_number() over (partition by ID order by Data) as rn
from #TestTable1
) t1
left join (
select ID, Data,
row_number() over (partition by ID order by Data) as rn
from #TestTable2
) t2 on t1.ID = t2.ID and t1.rn = t2.rn;
Note: Because of the LEFT JOIN, this does assume the amount of same ID's in table2 are equal or lower can those on table1. But you can change that to a FULL JOIN if that's not the case.
Returns :
ID Data Data2
1 r a
1 u q
2 t x
3 z c
To get the other result could have been achieved in different ways.
This is actually a more common situation.
Where one wants all from Table 1, but only get one value from Table 2 for each record of Table 1.
1) A top 1 with ties in combination with a order by rownumber()
select top 1 with ties
t1.ID, t1.Data,
t2.Data as Data2
from #TestTable1 t1
left join #TestTable2 t2 on t1.ID = t2.ID
order by row_number() over (partition by t1.ID, t1.Data order by t2.Data desc);
The top 1 with ties will only show those where the row_number() = 1
2) Using the row_number in a subquery:
select ID, Data, Data2
from (
select
t1.ID, t1.Data,
t2.Data as Data2,
row_number() over (partition by t1.ID, t1.Data order by t2.Data desc) as rn
from #TestTable1 t1
left join #TestTable2 t2 on t1.ID = t2.ID
) q
where rn = 1;
3) just a simple group by and a max :
select t1.ID, t1.Data, max(t2.Data) as Data2
from #TestTable1 t1
left join #TestTable2 t2 on t1.ID = t2.ID
group by t1.ID, t1.Data
order by 1,2;
All 3 give the same result:
ID Data Data2
1 r q
1 u q
2 t x
3 z c

Use ROW_NUMBER
DECLARE #Table1 TABLE (ID INT, DATA VARCHAR(10))
DECLARE #Table2 TABLE (ID INT, DATA VARCHAR(10))
INSERT INTO #Table1
VALUES
(1, 'r'),
(2, 't'),
(3, 'z'),
(1, 'u')
INSERT INTO #Table2
VALUES
(1, 'q'),
(1, 'a'),
(2, 'x'),
(3, 'c')
SELECT
A.*,
B.DATA
FROM
(SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY(SELECT NULL)) RowId FROM #Table1) A INNER JOIN
(SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY(SELECT NULL)) RowId FROM #Table2) B ON A.ID = B.ID AND
A.RowId = B.RowId

Related

T-SQL Left join twice

Query below works as planned, it shows exactly the way i joined it, and that is fine, but problem with it, is that if you have more "specialization" tables for users, something like "Mail type" or anything that user can have more then one data ... you would have to go two left joins for each and "give priority" via ISNULL (in this case)
I am wondering, how could I avoid using two joins and "give" priority to TypeId 2 over TypeId 1 in a single join, is that even possible?
if object_id('tempdb..#Tab1') is not null drop table #Tab1
create table #Tab1 (UserId int, TypeId int)
if object_id('tempdb..#Tab2') is not null drop table #Tab2
create table #Tab2 (TypeId int, TypeDescription nvarchar(50))
insert into #Tab1 (UserId, TypeId)
values
(1, 1),
(1, 2)
insert into #Tab2 (TypeId, TypeDescription)
values
(1, 'User'),
(2, 'Admin')
select *, ISNULL(t2.TypeDescription, t3.TypeDescription) [Role]
from #Tab1 t1
LEFT JOIN #Tab2 t2 on t1.TypeId = t2.TypeId and
t2.TypeId = 2
LEFT JOIN #Tab2 t3 on t1.TypeId = t3.TypeId and
t3.TypeId = 1

The first problem is determining priority. In this case, you could use the largest TypeId, but that does not seem like a great idea. You could add another column to serve as a priority ordinal instead.
From there, it is a top 1 per group query:
using top with ties and row_number():
select top 1 with ties
t1.UserId, t1.TypeId, t2.TypeDescription
from #Tab1 t1
left join #Tab2 t2
on t1.TypeId = t2.TypeId
order by row_number() over (
partition by t1.UserId
order by t2.Ordinal
--order by t1.TypeId desc
)
using common table expression and row_number():
;with cte as (
select t1.UserId, t1.TypeId, t2.TypeDescription
, rn = row_number() over (
partition by t1.UserId
order by t2.Ordinal
--order by t1.TypeId desc
)
from #Tab1 t1
left join #Tab2 t2
on t1.TypeId = t2.TypeId
)
select UserId, TypeId, TypeDescription
from cte
where rn = 1
rextester demo for both: http://rextester.com/KQAV36173
both return:
+--------+--------+-----------------+
| UserId | TypeId | TypeDescription |
+--------+--------+-----------------+
| 1 | 2 | Admin |
+--------+--------+-----------------+

Actually I don't think you don't need a join at all. But you have to take the max TypeID without respect to the TypeDescription, since these differences can defeat a Group By. So a workaround is to take the Max without TypeDescription initially, then subquery the result to get the TypeDescription.
SELECT dT.*
,(SELECT TypeDescription FROM #Tab2 T2 WHERE T2.TypeId = dT.TypeId) [Role] --2. Subqueries TypeDescription using the Max TypeID
FROM (
select t1.UserId
,MAX(T1.TypeId) [TypeId]
--, T1.TypeDescription AS [Role] --1. differences will defeat group by. Subquery for value later in receiving query.
from #Tab1 t1
GROUP BY t1.UserId
) AS dT
Produces Output:
UserId TypeId Role
1 2 Admin

SQL Server Joining Between Two Tables Based on Closest Value

I have two sets of data, as shown below.
Table 1:
enter code here
ID Value_1
1 233.67
2 83.28
3 84.49
4 1234.83
Table 2:
NewID Value_3 Value_4
5 NULL 83
6 NULL 85
7 NULL 235
I want to join the two tables in such a way that the resulting data set would look like below.
ID NewID Value_1 Value_2
1 7 233.67 235
2 5 83.28 83
3 6 84.49 85
4 NULL 1234.83 NULL
I know that using the ROUND command would cause future problems. Do any of you know how I could create the above resulting set?

Something like this:
DECLARE #Table1 TABLE
(
Id int,
Value1 float
)
INSERT INTO #Table1
VALUES
(1, 233.67),
(2, 83.28),
(3, 84.49),
(4, 1234.83)
DECLARE #Table2 TABLE
(
NewId int,
Value2 float
)
INSERT INTO #Table2
VALUES
(5, 83),
(6, 85),
(7, 235)
SELECT DISTINCT
t1.Id,
FIRST_VALUE(t2.NewId) OVER (PARTITION BY t1.Id ORDER BY ABS(t1.Value1 - t2.Value2) ASC) AS NewId,
t1.Value1,
FIRST_VALUE(t2.Value2) OVER (PARTITION BY t1.Id ORDER BY ABS(t1.Value1 - t2.Value2) ASC) AS Value2
FROM #Table1 t1
CROSS JOIN #Table2 t2
ORDER BY t1.Id
But you will get a result for ID=4 too.

This approach avoids a cross join, which requires every combination to be tested. It also works with SQL Server 2005 and up. It works by calculating a lower and upper bound between the midpoints of each record in Table2 and then joining on Table1 where Value_1 is between the midpoints. If Value_1 lands right on the boundary this code will round up (choose the higher of Value_4 to match).
--Load Table1
select 1 ID, convert(float,233.67) Value_1 into #Table1
insert into #Table1 select 2, 83.28
insert into #Table1 select 3, 84.49
insert into #Table1 select 4, 1234.83
--Load Table2
select 5 NewID, null Value_3, convert(float,83) Value_4 into #Table2
insert into #Table2 select 6, NULL, 85
insert into #Table2 select 7, NULL, 235
;with cte_Table2 as
(
select *, ROW_NUMBER() over (order by Value_4) OrderNum
from #Table2
)
Select #Table1.ID,
NewTable2.NewID,
#Table1.Value_1,
NewTable2.Value_4 Value_2
from #Table1
full join
(
select Table2.NewID,
Table2.Value_3,
Table2.Value_4,
Table2Prev.Value_4 + (Table2.Value_4 - Table2Prev.Value_4) / 2.0 LowerBound,
Table2.Value_4 + (Table2Next.Value_4 - Table2.Value_4) / 2.0 UpperBound
from cte_Table2 Table2
left join cte_Table2 Table2Prev
on Table2.OrderNum = Table2Prev.OrderNum + 1
left join cte_Table2 Table2Next
on Table2.OrderNum = Table2Next.OrderNum - 1
) NewTable2
on (#Table1.Value_1 < UpperBound or UpperBound is null)
and (#Table1.Value_1 >= LowerBound or LowerBound is null)
order by 1
I noticed that you did not show a match for ID of 4 in your expected output. If you need some kind of range to exclude matches then you would have to add that restriction to the ON conditions of the join. For example you could exclude values that are outside a threshold of 10 by making sure you use a FULL JOIN and add this to the ON conditions:
and abs(#Table1.Value_1-NewTable2.Value_4) < 10.0
It occurs to me though that maybe what you are really trying to do is a type of join where not only is the value in Table1 matched with it's closest value in Table2 but the value in Table2 is also the closest value in Table1. In this case, you would have to build a sub-query for Table1 with the boundary conditions and then check for it's closest match as well. Like this:
;with cte_Table1 as
(
select *, ROW_NUMBER() over (order by Value_1) OrderNum
from #Table1
),
cte_Table2 as
(
select *, ROW_NUMBER() over (order by Value_4) OrderNum
from #Table2
)
Select NewTable1.ID,
NewTable2.NewID,
NewTable1.Value_1,
NewTable2.Value_4 Value_2
from
(
select Table1.ID,
Table1.Value_1,
Table1Prev.Value_1 + (Table1.Value_1 - Table1Prev.Value_1) / 2.0 LowerBound,
Table1.Value_1 + (Table1Next.Value_1 - Table1.Value_1) / 2.0 UpperBound
from cte_Table1 Table1
left join cte_Table1 Table1Prev
on Table1.OrderNum = Table1Prev.OrderNum + 1
left join cte_Table1 Table1Next
on Table1.OrderNum = Table1Next.OrderNum - 1
) NewTable1
full join
(
select Table2.NewID,
Table2.Value_3,
Table2.Value_4,
Table2Prev.Value_4 + (Table2.Value_4 - Table2Prev.Value_4) / 2.0 LowerBound,
Table2.Value_4 + (Table2Next.Value_4 - Table2.Value_4) / 2.0 UpperBound
from cte_Table2 Table2
left join cte_Table2 Table2Prev
on Table2.OrderNum = Table2Prev.OrderNum + 1
left join cte_Table2 Table2Next
on Table2.OrderNum = Table2Next.OrderNum - 1
) NewTable2
on (NewTable1.Value_1 < NewTable2.UpperBound or NewTable2.UpperBound is null)
and (NewTable1.Value_1 >= NewTable2.LowerBound or NewTable2.LowerBound is null)
and (NewTable2.Value_4 < NewTable1.UpperBound or NewTable1.UpperBound is null)
and (NewTable2.Value_4 >= NewTable1.LowerBound or NewTable1.LowerBound is null)
order by 1
Now the records in the tables will only match if the values are the closest match both ways. This will ensure that each record in Table1 will be matched to at most 1 value in Table2. Because it's a full join you can get null values on either side... depending on which table is smaller.
One more thing to mention... this code might not handle duplicate values the way you need. If you have more than one of the same value in either Value_1 or Value_4, it will find a match to one of them but not both. If you want that... then you would have to change your common table expressions to:
;with cte_Table1 as
(
select *, ROW_NUMBER() over (order by Value_1) OrderNum
from (select distinct Value_1 from #Table1) tbl1
),
cte_Table2 as
(
select *, ROW_NUMBER() over (order by Value_4) OrderNum
from (select distinct Value_4 from #Table2) tbl2
)
Then update your sub-queries to only output the values with the boundaries. This would give you best matches with between the unique values. Then you could join to Table1 and Table2 ON Table1.Value_1 = NewTable1.Value1 and Table2.Value_4 = NewTable2.Value_4 to get the rest of the fields.
Certainly some optimizations can be made if you have very large tables... like breaking out some of these sub-queries into indexed temporary tables.

How to get running total by ranges?

How to get a running total based on range:
name, Level, value, runningtotal;
A 5 5;
B 1 3 10;
B 2 2 10;
C 1 11;

I can't tell what column you are doing your running total on. But in 2008 the best way of doing a running total is a self join like this:
CREATE TABLE #Test(
ID INT NOT NULL PRIMARY KEY,
AValue INT NOT NULL)
INSERT INTO #Test VALUES (1,4), (2,2), (3,18)
SELECT T1.ID, T1.AValue, SUM(T2.AValue) RunningTotal
FROM #Test T1
JOIN #Test T2 ON T1.ID >= T2.ID
GROUP BY T1.ID, T1.AValue
DROP TABLE #Test

with cte(name,value) as
(
select name,sum(value) as val from A group by name
)
,
cte2 as
(
SELECT c.name,SUM(b.value) as val
FROM cte c inner join
cte b
on b.name <= c.name
GROUP BY c.name
)
select c.name,c.value,b.val from A c inner join cte2 b on c.name=b.name
order by c.name

Combine two tables in SQL Server

I have tow tables with the same number of rows
Example:
table a:
1,A
2,B
3,C
table b:
AA,BB
AAA,BBB,
AAAA,BBBB
I want a new table made like that in SQL SErver:
1,A,AA,BB
2,B,AAA,BBB
3,C,AAAA,BBBB
How do I do that?

In SQL Server 2005 (or newer), you can use something like this:
-- test data setup
DECLARE #tablea TABLE (ID INT, Val CHAR(1))
INSERT INTO #tablea VALUES(1, 'A'), (2, 'B'), (3, 'C')
DECLARE #tableb TABLE (Val1 VARCHAR(10), Val2 VARCHAR(10))
INSERT INTO #tableb VALUES('AA', 'BB'),('AAA', 'BBB'), ('AAAA', 'BBBB')
-- define CTE for table A - sort by "ID" (I just assumed this - adapt if needed)
;WITH DataFromTableA AS
(
SELECT ID, Val, ROW_NUMBER() OVER(ORDER BY ID) AS RN
FROM #tablea
),
-- define CTE for table B - sort by "Val1" (I just assumed this - adapt if needed)
DataFromTableB AS
(
SELECT Val1, Val2, ROW_NUMBER() OVER(ORDER BY Val1) AS RN
FROM #tableb
)
-- create an INNER JOIN between the two CTE which just basically selected the data
-- from both tables and added a new column "RN" which gets a consecutive number for each row
SELECT
a.ID, a.Val, b.Val1, b.Val2
FROM
DataFromTableA a
INNER JOIN
DataFromTableB b ON a.RN = b.RN
This gives you the requested output:

You could do a rank over the primary keys, then join on that rank:
SELECT RANK() OVER (table1.primaryKey),
T1.*,
T2.*
FROM
SELECT T1.*, T2.*
FROM
(
SELECT RANK() OVER (table1.primaryKey) [rank], table1.* FROM table1
) AS T1
JOIN
(
SELECT RANK() OVER (table2.primaryKey) [rank], table2.* FROM table2
) AS T2 ON T1.[rank] = T2.[rank]

Your query is strange, but in Oracle you can do this:
select a.*, tb.*
from a
, ( select rownum rn, b.* from b ) tb -- temporary b - added rn column
where a.c1 = tb.rn -- assuming first column in a is called c1
if there is not column with numbers in a you can do same trick twice
select ta.*, tb.*
from ( select rownum rn, a.* from a ) ta
, ( select rownum rn, b.* from b ) tb
where ta.rn = tb.rn
Note: be aware that this can generate random combination, for example
1 A AA BB
2 C A B
3 B AAA BBB
because there is no order by in ta and tb

SQL Server: How to use UNION with two queries that BOTH have a WHERE clause?

Given:
Two queries that require filtering:
select top 2 t1.ID, t1.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
order by t1.ReceivedDate desc
And:
select top 2 t2.ID
from Table t2
where t2.Type = 'TYPE_2'
order by t2.ReceivedDate desc
Separately, these return the IDs I'm looking for: (13, 11 and 12, 6)
Basically, I want the two most recent records for two specific types of data.
I want to union these two queries together like so:
select top 2 t1.ID, t2.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
order by ReceivedDate desc
union
select top 2 t2.ID
from Table t2
where t2.Type = 'TYPE_2'
order by ReceivedDate desc
Problem:
The problem is that this query is invalid because the first select cannot have an order by clause if it is being unioned. And it cannot have top 2 without having order by.
How can I fix this situation?

You should be able to alias them and use as subqueries (part of the reason your first effort was invalid was because the first select had two columns (ID and ReceivedDate) but your second only had one (ID) - also, Type is a reserved word in SQL Server, and can't be used as you had it as a column name):
declare #Tbl1 table(ID int, ReceivedDate datetime, ItemType Varchar(10))
declare #Tbl2 table(ID int, ReceivedDate datetime, ItemType Varchar(10))
insert into #Tbl1 values(1, '20010101', 'Type_1')
insert into #Tbl1 values(2, '20010102', 'Type_1')
insert into #Tbl1 values(3, '20010103', 'Type_3')
insert into #Tbl2 values(10, '20010101', 'Type_2')
insert into #Tbl2 values(20, '20010102', 'Type_3')
insert into #Tbl2 values(30, '20010103', 'Type_2')
SELECT a.ID, a.ReceivedDate FROM
(select top 2 t1.ID, t1.ReceivedDate
from #tbl1 t1
where t1.ItemType = 'TYPE_1'
order by ReceivedDate desc
) a
union
SELECT b.ID, b.ReceivedDate FROM
(select top 2 t2.ID, t2.ReceivedDate
from #tbl2 t2
where t2.ItemType = 'TYPE_2'
order by t2.ReceivedDate desc
) b

select * from
(
select top 2 t1.ID, t1.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
order by t1.ReceivedDate de
) t1
union
select * from
(
select top 2 t2.ID
from Table t2
where t2.Type = 'TYPE_2'
order by t2.ReceivedDate desc
) t2
or using CTE (SQL Server 2005+)
;with One as
(
select top 2 t1.ID, t1.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
order by t1.ReceivedDate de
)
,Two as
(
select top 2 t2.ID
from Table t2
where t2.Type = 'TYPE_2'
order by t2.ReceivedDate desc
)
select * from One
union
select * from Two

declare #T1 table(ID int, ReceivedDate datetime, [type] varchar(10))
declare #T2 table(ID int, ReceivedDate datetime, [type] varchar(10))
insert into #T1 values(1, '20010101', '1')
insert into #T1 values(2, '20010102', '1')
insert into #T1 values(3, '20010103', '1')
insert into #T2 values(10, '20010101', '2')
insert into #T2 values(20, '20010102', '2')
insert into #T2 values(30, '20010103', '2')
;with cte1 as
(
select *,
row_number() over(order by ReceivedDate desc) as rn
from #T1
where [type] = '1'
),
cte2 as
(
select *,
row_number() over(order by ReceivedDate desc) as rn
from #T2
where [type] = '2'
)
select *
from cte1
where rn <= 2
union all
select *
from cte2
where rn <= 2

The basic premise of the question and the answers are wrong. Every Select in a union can have a where clause. It's the ORDER BY in the first query that's giving yo the error.

The answer is misleading because it attempts to fix a problem that is not a problem. You actually CAN have a WHERE CLAUSE in each segment of a UNION. You cannot have an ORDER BY except in the last segment. Therefore, this should work...
select top 2 t1.ID, t1.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
-----remove this-- order by ReceivedDate desc
union
select top 2 t2.ID, t2.ReceivedDate --- add second column
from Table t2
where t2.Type = 'TYPE_2'
order by ReceivedDate desc

Create views on two first "selects" and "union" them.

Notice that each SELECT statement within the UNION must have the same number of columns. The columns must also have similar data types. Also, the columns in each SELECT statement must be in the same order.
you are selecting
t1.ID, t2.ReceivedDate
from Table t1
union
t2.ID
from Table t2
which is incorrect.
so you have to write
t1.ID, t1.ReceivedDate from Table t1
union
t2.ID, t2.ReceivedDate from Table t1
you can use sub query here
SELECT tbl1.ID, tbl1.ReceivedDate FROM
(select top 2 t1.ID, t1.ReceivedDate
from tbl1 t1
where t1.ItemType = 'TYPE_1'
order by ReceivedDate desc
) tbl1
union
SELECT tbl2.ID, tbl2.ReceivedDate FROM
(select top 2 t2.ID, t2.ReceivedDate
from tbl2 t2
where t2.ItemType = 'TYPE_2'
order by t2.ReceivedDate desc
) tbl2
so it will return only distinct values by default from both table.

select top 2 t1.ID, t2.ReceivedDate, 1 SortBy
from Table t1
where t1.Type = 'TYPE_1'
union
select top 2 t2.ID, 2 SortBy
from Table t2
where t2.Type = 'TYPE_2'
order by 3,2

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

How to join two Tables which have no unique ID - sql-server

Related

T-SQL Left join twice

SQL Server Joining Between Two Tables Based on Closest Value

How to get running total by ranges?

Combine two tables in SQL Server

SQL Server: How to use UNION with two queries that BOTH have a WHERE clause?

Categories

Resources