Combining two tables with multiple common columns into one in SQL - sql-server

I am trying to create a table in SQL Server which has the same output as the following:
Select *
FROM Table1
LEFT JOIN Table2
ON
Table1.Key1 = Table2.Key1
AND Table1.Key2 = Table2.Key2
The result of the above query is exactly what I need, but as a new table.
The problem is, there are multiple columns that are common between the two tables. I have executed the following code:
Select *
INTO NewTable
FROM Table1
LEFT JOIN Table2
ON
Table1.Key1 = Table2.Key1
AND Table1.Key2 = Table2.Key2
The following error appears:
Msg 2705, Level 16, State 3, Line 1
Column names in each table must be unique. Column name 'Key1' in table 'NewTable' is specified more than once.
Could someone please help? I would highly appreciate it after a long day of searching the internet without any solution.
Thank you so much in advance!

This query lists the columns that exist in Table2 but not in Table1, so you can build a select list without duplicate column names.
select ',' + Column_Name
from INFORMATION_SCHEMA.COLUMNS c2
where column_Name not in (
select COLUMN_NAME
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME = 'table1')
and table_Name = 'Table2'
So you can safely say:
Select table1.*
<<Paste in your results from above here>>
INTO NewTable
FROM Table1
LEFT JOIN Table2
ON
Table1.Key1 = Table2.Key1
AND Table1.Key2 = Table2.Key2
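If you would rather not paste the column list by hand, the two steps above could be combined with dynamic SQL. This is only a sketch under some assumptions: SQL Server 2017 or later (for STRING_AGG), both tables in the dbo schema, and the variable names @cols and @sql are made up for illustration.

```sql
-- Sketch only: assumes SQL Server 2017+ (STRING_AGG) and tables in the dbo schema.
DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Columns that exist in Table2 but not in Table1, qualified and quoted.
SELECT @cols = STRING_AGG(CAST(N'Table2.' + QUOTENAME(c2.COLUMN_NAME) AS nvarchar(max)), N', ')
               WITHIN GROUP (ORDER BY c2.ORDINAL_POSITION)
FROM INFORMATION_SCHEMA.COLUMNS AS c2
WHERE c2.TABLE_SCHEMA = N'dbo'
  AND c2.TABLE_NAME = N'Table2'
  AND c2.COLUMN_NAME NOT IN (
        SELECT c1.COLUMN_NAME
        FROM INFORMATION_SCHEMA.COLUMNS AS c1
        WHERE c1.TABLE_SCHEMA = N'dbo'
          AND c1.TABLE_NAME = N'Table1');

-- If Table2 has no extra columns, @cols is NULL and only Table1.* is selected.
SET @sql = N'SELECT Table1.*' + ISNULL(N', ' + @cols, N'') + N'
INTO dbo.NewTable
FROM dbo.Table1
LEFT JOIN dbo.Table2
  ON Table1.Key1 = Table2.Key1
 AND Table1.Key2 = Table2.Key2;';

PRINT @sql;               -- inspect the generated statement before running it
EXEC sp_executesql @sql;
```

Printing the statement first makes it easy to check the generated column list before NewTable is actually created.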

When you write a SELECT ... INTO query, SQL Server creates the new table dynamically, and every column name in that table must be unique.
In your case the join returns columns with the same name from both tables.
So replace * with an explicit column list, as shown below:
Select Column1, Column2, ... etc
INTO NewTable
FROM Table1
LEFT JOIN Table2
ON
Table1.Key1 = Table2.Key1
AND Table1.Key2 = Table2.Key2

Your easiest option would be to change your select-into statement into something like this, where you give unique names to the fields with the same name:
Select Table1.Key1 as [Key1a], Table2.Key1 as [Key1b], etc.
INTO NewTable
FROM Table1
LEFT JOIN Table2
ON
Table1.Key1 = Table2.Key1
AND Table1.Key2 = Table2.Key2
If you are using SSMS, you could highlight your query, right-click on it and select "Design in Query Editor" and it will show you the select statement as "select [all of the fields]" rather than "select *", which will probably be useful to you.

Related

GROUP BY T1.* ? Group by all columns in Table1, joined left by table 2, and Aggregate functions on T2 columns?

I have a query that is merging 2 tables. Table 1 has many columns, and may eventually expand. Table 2 also has several columns, but I will be performing aggregate functions on 90% of its columns. Table 1 has 300 + rows, Table 2 has 84K + rows.
SELECT
t1.*
,t2.c2
,SUM(t2.c3)
,SUM(t2.c4)
FROM
Table1 AS t1
LEFT JOIN Table2 AS t2 ON t1.c10 = t2.c1
GROUP BY
t1.*
,t2.c2
I'm getting an error Incorrect Syntax near '*' and it points to the line containing the GROUP BY statement.
I am aware that the SELECT t1.* works as I ran this portion prior to trying to aggregate T2 columns and it worked as expected.
Is there a way to quickly GROUP BY all the columns in T1? I know normally we would select only needed columns, but in this case, I need all the T1 columns.
Previous research has led me to only find instances where 1 table was used, and mostly people were looking to get or remove duplicate values. I'm looking to specifically combine the 300 records of T1 to the 84K records of T2 without having to name off all the columns from T1 in the GROUP BY section.
This method is slightly unconventional, but you can build the column list in a variable and run it with dynamic SQL. Below is an example of how you can do it:
declare @test nvarchar(max)
set @test = ''
select @test += Column_name + ',' from information_schema.columns where table_name = 'Table1'
DECLARE @sql nvarchar(max)
SELECT @sql = N'SELECT top 10 ' + @test + 'NULL as a FROM Table1;'
EXEC sp_executesql @sql
You can apply the same principle and rewrite your query to use the group by function. Hope this helps.
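Applied to the GROUP BY query from the question, the same idea might look like the sketch below. It rests on assumptions that are not in the original post: SQL Server 2017 or later (for STRING_AGG), tables in the dbo schema, no Table1 columns of a type that cannot be grouped (such as text, ntext, image or xml), and the aliases sum_c3 and sum_c4 are invented for illustration.

```sql
-- Sketch only: builds the t1 column list once and reuses it in SELECT and GROUP BY.
DECLARE @cols nvarchar(max), @sql nvarchar(max);

SELECT @cols = STRING_AGG(CAST(N't1.' + QUOTENAME(COLUMN_NAME) AS nvarchar(max)), N', ')
               WITHIN GROUP (ORDER BY ORDINAL_POSITION)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = N'dbo'
  AND TABLE_NAME = N'Table1';

SET @sql = N'SELECT ' + @cols + N', t2.c2, SUM(t2.c3) AS sum_c3, SUM(t2.c4) AS sum_c4
FROM dbo.Table1 AS t1
LEFT JOIN dbo.Table2 AS t2 ON t1.c10 = t2.c1
GROUP BY ' + @cols + N', t2.c2;';

PRINT @sql;               -- check the generated statement first
EXEC sp_executesql @sql;
```

Because the column list is read from the catalog at run time, the statement keeps working if Table1 gains columns later.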
Based on the article posted by @wosi, https://dba.stackexchange.com/questions/21226/why-do-wildcards-in-group-by-statements-not-work, I was able to modify the code and get the expected results. Please note I went from 80K to 70K because I was joining the tables on 1 column; the way my data was structured, I had to join on 2 columns. The final code looks something like this:
SELECT
t1.*
,t2.c2
,t2.c3
,t2.c4
FROM
Table1 AS t1
LEFT JOIN
(SELECT c1, c2, SUM(c3) AS c3, SUM(c4) AS c4
FROM Table2
GROUP BY c1, c2) AS t2
ON t1.c10 = t2.c1 AND t1.c15 = t2.c2
You can't use * in a GROUP BY clause. You can use dynamic SQL to avoid typing all the columns in a stored procedure, but if you are using T-SQL in a view you have to type all the columns.

Remove from other SQL table after using the Match function?

I found this code online and like its use for inserting data based on common column variables.
Select * from Table1
Merge into table1 as T using [table] as S
on T.[Last Name] = S.[Last Name] and T.[First Name] = S.[First Name]
When Matched then Update Set T.[DOB] = S.[DOB];
The problem is that I want to get rid of the data that matched up from the source. So, once the information has been matched and inserted into the target, I want to delete the matched information from the source.
After the merge statement you can do a delete statement using inner join:
DELETE T1
FROM Table1 T1 INNER JOIN Table2 T2
ON T1.[Last Name] = T2.[Last Name] AND T1.[First Name] = T2.[First Name];
Note: By default, SQL Server table names are not case-sensitive, so both Table1 and table1 refer to the same table. So please change the table name accordingly.

Find missing values on the same column of two tables

Suppose you have two tables in a SQL Server database with the same schema for both tables. I want to compare a single column on both tables and find the values that are missing in table1 but are in table2. I've been doing this manually in Excel with a macro after I've gotten a distinct list in each query, but it would be less work if I had a query. How can I find the missing records via T-SQL? I'd like to do this for the following data types: datetime, nvarchar & bigint.
SELECT DISTINCT [dbo].[table1].[column1]
FROM [dbo].[table1]
ORDER BY [dbo].[table1].[column1] DESC
SELECT DISTINCT [dbo].[table2].[column1]
FROM [dbo].[table2]
ORDER BY [dbo].[table2].[column1] DESC
There are several ways you can do this...
LEFT JOIN:
SELECT DISTINCT t2.column1
FROM dbo.table2 t2
LEFT JOIN dbo.table1 t1
ON t2.Column1 = t1.Column1
WHERE t1.Column1 IS NULL
NOT EXISTS:
SELECT DISTINCT t2.column1
FROM dbo.table2 t2
WHERE NOT EXISTS (
SELECT 1
FROM dbo.table1 t1
WHERE t1.column1 = t2.column1
)
NOT IN:
SELECT DISTINCT t2.column1
FROM dbo.table2 t2
WHERE t2.column1 NOT IN (
SELECT t1.column1
FROM dbo.table1 t1
)
There are some slight variations in the behavior and efficiency of these approaches... based mostly on the presence of NULL values in columns, so try each approach to find the most efficient one that gives the results you expect.
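To make the NULL difference concrete, here is a small sketch using made-up sample data in table variables (the variables @table1 and @table2 and their contents are hypothetical). It assumes the default ANSI_NULLS ON setting.

```sql
-- Hypothetical sample data: table1 contains a NULL in column1.
DECLARE @table1 TABLE (column1 int NULL);
DECLARE @table2 TABLE (column1 int NULL);

INSERT INTO @table1 (column1) VALUES (1), (NULL);
INSERT INTO @table2 (column1) VALUES (1), (2);

-- NOT EXISTS: returns the single value 2.
SELECT DISTINCT t2.column1
FROM @table2 AS t2
WHERE NOT EXISTS (
    SELECT 1
    FROM @table1 AS t1
    WHERE t1.column1 = t2.column1
);

-- NOT IN: returns no rows, because comparing 2 with NULL is UNKNOWN,
-- so the NOT IN predicate is never TRUE once the subquery yields a NULL.
SELECT DISTINCT t2.column1
FROM @table2 AS t2
WHERE t2.column1 NOT IN (
    SELECT t1.column1
    FROM @table1 AS t1
);
```

If column1 can be NULL in table1, NOT EXISTS (or adding a WHERE t1.column1 IS NOT NULL filter inside the NOT IN subquery) is usually the safer choice.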
SELECT DISTINCT [dbo].[table2].[column1]
FROM [dbo].[table2]
except
SELECT DISTINCT [dbo].[table1].[column1]
FROM [dbo].[table1]
This returns all the values of column1 in Table2 that are not present in column1 of Table1.
Basically, you can use LEFT JOIN.
TableB is set as the main table in this case. When it is joined with TableA using LEFT JOIN, the records that have no match in TableA will still be in the result list, but their TableA values are NULL. So to keep only the non-matching records, add a filtering condition which selects only the records with a NULL value on TableA.
SELECT b.*
FROM tableB b
LEFT JOIN tableA a
ON a.column1 = b.column1
WHERE a.column1 IS NULL
To further gain more knowledge about joins, kindly visit the link below:
Visual Representation of SQL Joins
From SQL Server 2005 onwards, you can use EXCEPT:
SELECT DISTINCT [dbo].[table2].[column1]
FROM [dbo].[table2]
Except
SELECT DISTINCT [dbo].[table1].[column1]
FROM [dbo].[table1]

Display the table name in the select statement

I need to display the table name in the select statement. How?
Exact question:
We have common columns in two tables. We are displaying the records by using
select column_name from table_name_1
union
select column_name from table_name_2
But the requirement is that we need to display the source table name along with the data.
Consider that a and c are present in table_1, and b and d are present in table_2.
We need the output in the following way,
e.g.:
column_name table_name
a table_1
b table_2
c table_1
d table_2
.......................................................
......................................................
Is this possible?
select 'table1', * from table1
union
select 'table2',* from table2
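A variant that matches the layout asked for in the question could look like the sketch below. It reuses the table and column names from the question; the alias table_name on the literal and the ORDER BY are additions for illustration.

```sql
-- The string literal tags each row with the table it came from.
SELECT column_name, 'table_1' AS table_name
FROM table_name_1
UNION
SELECT column_name, 'table_2' AS table_name
FROM table_name_2
ORDER BY column_name;
```

Since the literal differs between the two branches, UNION only removes duplicates within each table here; UNION ALL would keep every row and avoid the duplicate-removal step.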

SQL Server (2005+) query to return the base table and base column (field) for each column (field) in a view

I want a query that will return a row for each column in a view, and a row for the view itself.
There should be a column basetable in the result that gives the base table for the column in the current row, and a column basefield in the result that gives the name of the column in the underlying query (for renamed columns). It would be a bonus if any calculations could also be included in the basefield column.
I don't think this can be done. Am I wrong?
In the example below "what goes here" should be replaced by table1 or table2 as appropriate in the basetable column, and a, b, or c as appropriate in the basefield column.
create table table1 (a int, b int)
create table table2 (a int, c int)
go
create view view1 as select table1.a, table1.b, table2.c from table1 left join table2 on table1.a = table2.a
go
select * from
(
select 'View' objecttype,O.name viewname,'' fieldname,0 column_id,'' typename,'' max_length,'' [precision], '' scale, '' is_identity,
'what goes here' basetable, '' basefield
from sys.objects O where O.type='V' and O.[schema_id] = 1
union all
select 'Field' objecttype,object_name(C.[object_id]) viewname,C.name fieldname,C.column_id,T.name typename,C.max_length,C.precision,C.scale,C.is_identity,
'what goes here' basetable, 'what goes here' basefield
from sys.columns C
left join sys.types T on C.user_type_id=T.system_type_id
where C.[object_id] in (select O.[object_id] from sys.objects O where O.type='V')
) I
where viewname in ('view1')
order by viewname, column_id
drop view view1
drop table table1
drop table table2
There are a few views in the information schema that you can use to deduce this information. A basic query that gives you quite a bit of data would be:
select *
from INFORMATION_SCHEMA.VIEW_COLUMN_USAGE v
inner join INFORMATION_SCHEMA.COLUMNS v1
on v.VIEW_NAME=v1.TABLE_NAME and v.COLUMN_NAME=v1.COLUMN_NAME
where v.VIEW_NAME='My_View'
The tables in the information schema are:
select * from INFORMATION_SCHEMA.VIEWS
select * from INFORMATION_SCHEMA.VIEW_TABLE_USAGE
select * from INFORMATION_SCHEMA.VIEW_COLUMN_USAGE
So try using these.
However, this works only when the view uses columns directly from the base tables, without any formula or derivation.
You know it might not map 1 to 1? A column in a view might be the result of several (or even none!) columns from different tables.
That said, it should be possible by parsing the source for the view. But the sql code would be procedural/imperative in nature and not at all trivial.

Resources