Union 50+ tables with different numbers of columns using SQL Server

I have over 50 different tables which I would like to combine into one big table. All the tables have a different number of columns.
Currently, to union the tables together, I am writing an individual select statement for each of the tables, and inserting a null column if that column doesn't exist in the table. Then I am using UNION ALL to union them together.
For example:
(
select col1
, null as col2
, col3
from table1
union all
select col1
, col2
, null as col3
from table2
)
Although this works, it is very manual and time consuming. Is there a better, more efficient way to union these tables into one? With over 50 tables, I am going to end up with thousands of lines of code.
Thank you!

You can query SQL Server metadata, and from the result dynamically construct a SQL statement. This can be done in any programming language, including T-SQL itself.
Here's a rough example; execute this query, copy/paste the result back into the query window, and execute that.
If the 50 tables have similar names (e.g. all start with Foo), then you can replace the exhaustive table list (WHERE TABLE_NAME IN ('table1', 'table2', 'table3') in my example) by WHERE TABLE_NAME LIKE 'Foo%'.
WITH
AllTables (TABLE_NAME) AS (
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME IN ('table1', 'table2', 'table3')
),
TablesWithSelectors (TABLE_NAME, COLUMN_NAME, Selector) AS (
SELECT t.TABLE_NAME, a.COLUMN_NAME, CASE WHEN b.COLUMN_NAME IS NULL THEN 'NULL AS ' ELSE '' END + a.COLUMN_NAME
FROM AllTables t
CROSS JOIN (SELECT DISTINCT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME IN (SELECT TABLE_NAME FROM AllTables)) a
LEFT OUTER JOIN INFORMATION_SCHEMA.COLUMNS b ON b.TABLE_NAME = t.TABLE_NAME AND b.COLUMN_NAME = a.COLUMN_NAME
),
SelectStatements (Sql) AS (
SELECT
'SELECT ' +
STUFF((
SELECT ', ' + Selector
FROM TablesWithSelectors
WHERE TABLE_NAME = r.TABLE_NAME
FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)')
, 1, 2, '') +
' FROM ' +
TABLE_NAME
FROM TablesWithSelectors r
GROUP BY TABLE_NAME
)
SELECT STUFF((
SELECT ' UNION ALL ' + sql
FROM SelectStatements
FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)'), 1, 11, '')
Thanks to:
How to use GROUP BY to concatenate strings in SQL Server?
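As noted above, the statement can also be assembled outside T-SQL. Here is a minimal Python sketch of the same idea, assuming the column metadata has already been read (for example from INFORMATION_SCHEMA.COLUMNS) into a dictionary of table name to column list. The names quote_name, build_union_all and the sample metadata are hypothetical, not part of the original answer.

```python
def quote_name(identifier):
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + identifier.replace("]", "]]") + "]"


def build_union_all(table_columns):
    """Build one UNION ALL statement over tables with differing column sets.

    table_columns: dict mapping table name -> list of column names.
    A column missing from a table is emitted as NULL AS <column>.
    """
    # Collect every distinct column name, preserving first-seen order.
    all_columns = []
    for columns in table_columns.values():
        for column in columns:
            if column not in all_columns:
                all_columns.append(column)

    selects = []
    for table, columns in table_columns.items():
        present = set(columns)
        selectors = [
            quote_name(c) if c in present else "NULL AS " + quote_name(c)
            for c in all_columns
        ]
        selects.append(
            "SELECT " + ", ".join(selectors) + " FROM " + quote_name(table)
        )
    # Joining with a separator avoids a leading or trailing UNION ALL.
    return "\nUNION ALL\n".join(selects)


# Hypothetical metadata, shaped like the two tables in the question.
metadata = {
    "table1": ["col1", "col3"],
    "table2": ["col1", "col2"],
}
union_sql = build_union_all(metadata)
print(union_sql)
```

Every SELECT lists the columns in the same order, which UNION ALL requires since it matches columns by position rather than by name.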


string+ select query order by + string ;

I want to list all tables in a database whose names start with 'T_'.
I have written a query that returns all the table names,
but now I want the output to be a 'select count(*) from <table> UNION' statement for every table in the database.
That is, I want to get:
select count(*) from T1 UNION
select count(*) from T2 UNION
select count(*) from T3 UNION...
and so on
There are 1000s of rows, so I want a query that outputs the count(*) queries itself.
select 'select count(*) from ' + table_name from INFORMATION_SCHEMA.TABLES where table_type='BASE TABLE'
and left(table_name,2) = 'T_'
order by TABLE_NAME
This query returns a select count(*) statement for every table whose name starts with T_. Then I tried:
select 'select count(*) from ' + table_name from INFORMATION_SCHEMA.TABLES where table_type='BASE TABLE'
and left(table_name,2) = 'T_'
order by TABLE_NAME
+'UNION';
Getting output
select count(*) from T_T1
select count(*) from T_T2
select count(*) from T_T3
expected output
select count(*) from T_T1 UNION
select count(*) from T_T2 UNION
select count(*) from T_T3 UNION
order by TABLE_NAME + ' UNION' means you want to order by the value of TABLE_NAME with the string UNION concatenated onto it (which changes nothing).
You need to put the UNION (I actually suggest UNION ALL here) in your SELECT list, concatenated after the table name: ... + table_name + N' UNION ALL'.
I also suggest changing table_name to QUOTENAME(table_name). Giving you a final query of:
SELECT N'SELECT COUNT(*) FROM ' + QUOTENAME(TABLE_NAME) + NCHAR(13) + NCHAR(10) + N'UNION ALL'
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
AND LEFT(TABLE_NAME, 2) = 'T_'
ORDER BY TABLE_NAME;
select 'select count(*) from ' + table_name + ' UNION ' from INFORMATION_SCHEMA ...
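A query that appends UNION ALL to every row also appends it to the last one, so the trailing keyword has to be trimmed by hand before running the result. If the table names are available in client code, joining with a separator sidesteps that. This is a minimal Python sketch under that assumption; build_count_union and the sample table names are hypothetical, and the prefix check here is case-sensitive, which may differ from the database collation.

```python
def quote_name(identifier):
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + identifier.replace("]", "]]") + "]"


def build_count_union(table_names, prefix="T_"):
    """Build SELECT COUNT(*) statements joined by UNION ALL.

    Only tables whose names start with `prefix` are included,
    ordered by name. No UNION ALL follows the last statement.
    """
    selected = sorted(name for name in table_names if name.startswith(prefix))
    lines = ["SELECT COUNT(*) FROM " + quote_name(name) for name in selected]
    return " UNION ALL\n".join(lines)


# Hypothetical table names, e.g. as read from INFORMATION_SCHEMA.TABLES.
count_sql = build_count_union(["T_T2", "Other", "T_T1", "T_T3"])
print(count_sql)
```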

Get first row in table while printing out Table name and Column name

I need to get a quick overview of the data in an MS SQL database and found the following code, which gives me all but the last column I need. This third column should show data from the first row.
SELECT TABLE_NAME, COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS
So my question is: how should I formulate the SQL query to get a third column with data from the first row?
//Update
This code gives me all I want except what table the data comes from. (In a similar question answered by Yaroslav at Select the first 3 rows of each table in a database)
DECLARE @sql VARCHAR(MAX)='';
SELECT @sql=@sql+'SELECT TOP 3 * FROM '+'['+SCHEMA_NAME(schema_id)+'].['+name+']'+';'
FROM sys.tables
EXEC(@sql)
One approach, if I understand this correctly, is the undocumented stored procedure sp_MSforeachtable. The question mark is a placeholder for the table's name:
Hint: edit "YourDataBase"...
EXEC sp_msforeachtable 'USE YourDataBase;SELECT TOP 1 ''?'' AS TableName, * FROM ?';
With this query you can explore all first rows easily
CREATE TABLE #test(TableName NVARCHAR(MAX),Content XML)
EXEC sp_msforeachtable
'USE YourDataBase;INSERT INTO #test SELECT ''?'' AS TableName, (SELECT TOP 1 * FROM ? FOR XML PATH(''row''))';
SELECT * FROM #test;
UPDATE
Your own code would return the table's name also. Try this
DECLARE @sql VARCHAR(MAX)='';
SELECT @sql=@sql+'SELECT TOP 3 ''' + t.[name] + ''' AS TableName, * FROM '+'['+SCHEMA_NAME(schema_id)+'].['+name+']'+';'
FROM sys.tables t
EXEC(@sql)
I know this isn't much, but it builds a select statement for you. You can then loop through the statements and exec each one, or write a union; I guess it could be a good starting point.
Edit: You could also write a loop that executes each statement and inserts the value into a final table. Then just select from that table and you should be good.
SELECT
t.String
,t.q
,t.TABLE_NAME
,t.q2
,t.tbname
,t.com2
,t.q4
,t.COLUMN_NAME
,t.q5
,t.Columnname
,t.com3
,t.ColName
,t.[From]
,t.FromSelect
FROM (SELECT
'Select top 1 ' AS String
,'''' q
,TABLE_NAME
,'''' q2
,'as TableName'
as tbname
,',' com2
,'''' q4
,COLUMN_NAME
,'''' q5
,'as COLUMN_NAME'
as Columnname
,',' com3
,COLUMN_NAME as ColName
, 'Value From ' as [From]
,TABLE_NAME as FromSelect
,ROW_NUMBER() OVER (PARTITION BY TABLE_NAME ORDER BY TABLE_NAME DESC, COLUMN_NAME) rn
FROM INFORMATION_SCHEMA.COLUMNS c
) t
WHERE rn = 1;
The result will be something like this for each table.
Select top 1 ' zipcodes ' as TableName , ' CITY ' as COLUMN_NAME , CITY Value From zipcodes
Select top 1 ' _BHCAMERAPRICE ' as TableName , ' _BHID ' as COLUMN_NAME , _BHID Value From _BHCAMERAPRICE

Is it possible to select all columns from a table except one (e.g. ID)?

I am using a third party application which has an absurd number of columns per table. When I select data, often I need all columns except the ID. Or all columns except ID and DateCreated.
Using the sys.columns it's possible to find out which columns are available in a table. How can I use this information to create statements? What would be the best way to do this?
This script generates a SELECT statement listing all columns of a table except its primary key column and any column named DateCreated.
SELECT
'SELECT '+
SUBSTRING(LIST,1,LEN(LIST)-1)
+' FROM [Person].[Address]'
FROM
(
SELECT
'['+COL.COLUMN_NAME+'],'
FROM INFORMATION_SCHEMA.COLUMNS COL
LEFT JOIN
(
SELECT
CON.CONSTRAINT_TYPE,
USG.TABLE_SCHEMA,
USG.TABLE_NAME,
USG.COLUMN_NAME,
CON.CONSTRAINT_NAME,
USG.TABLE_CATALOG
FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE USG
INNER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS CON
ON USG.CONSTRAINT_NAME = CON.CONSTRAINT_NAME
)Q
ON COL.TABLE_SCHEMA = Q.TABLE_SCHEMA
AND COL.TABLE_NAME = Q.TABLE_NAME
AND COL.TABLE_CATALOG = Q.TABLE_CATALOG
AND COL.COLUMN_NAME = Q.COLUMN_NAME
WHERE COL.TABLE_SCHEMA ='Person'
AND COL.TABLE_NAME = 'Address'
AND
(
Q.CONSTRAINT_TYPE IS NULL
OR
Q.CONSTRAINT_TYPE <> 'PRIMARY KEY'
)
AND COL.COLUMN_NAME <> 'DateCreated'
FOR XML PATH(''))L(LIST)
Replace the string Person with your schema name and Address with your table name.
Try using a dynamic query:
DECLARE @Names VARCHAR(MAX)
SELECT @Names = COALESCE(@Names + ', ', '') + Column_Name
FROM INFORMATION_SCHEMA.COLUMNS
WHERE Table_name='Table1'
And Column_Name!='ID'
DECLARE @Query VARCHAR(MAX)
SELECT @Query = 'SELECT '+ @Names + ' INTO Table2 FROM Table1'
exec (@Query)
SELECT * from Table2

Find table containing two particular columns in SQL Server

I would like to find all the tables containing two particular separate columns in SQL Server.
The first column name is "LIKE '%A%'" (Meaning it contains the substring "A") and the second column name is "LIKE '%B%'" (Meaning it contains the substring "B").
I wrote the following query and I would like to check its correctness:
SELECT s.TABLE_NAME
FROM (SELECT COLUMN_NAME, TABLE_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME LIKE '%A%'
UNION
SELECT COLUMN_NAME, TABLE_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME LIKE '%B%') s
WHERE EXISTS (SELECT COLUMN_NAME, s.TABLE_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME LIKE '%A%')
AND EXISTS (SELECT COLUMN_NAME, s.TABLE_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME LIKE '%B%');
That should be easier:
SELECT s.TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES AS s
WHERE s.TABLE_TYPE='BASE TABLE'
AND EXISTS (SELECT 1
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME=s.TABLE_NAME AND COLUMN_NAME LIKE '%A%')
AND EXISTS (SELECT 1
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME=s.TABLE_NAME AND COLUMN_NAME LIKE '%B%');
UPDATE
With this code you will get all columns matching each criterion as a list:
SELECT s.TABLE_NAME,listA,listB
FROM INFORMATION_SCHEMA.TABLES AS s
CROSS APPLY (SELECT STUFF(
(
SELECT ', ' + COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME=s.TABLE_NAME AND COLUMN_NAME LIKE '%med%'
ORDER BY ORDINAL_POSITION
FOR XML PATH('')
),1,2,'')
) AS columnsWithA(listA)
CROSS APPLY (SELECT STUFF(
(
SELECT ', ' + COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME=s.TABLE_NAME AND COLUMN_NAME LIKE 'kli%'
ORDER BY ORDINAL_POSITION
FOR XML PATH('')
),1,2,'')
) AS columnsWithB(listB)
WHERE s.TABLE_TYPE='BASE TABLE'
AND listA IS NOT NULL AND listB IS NOT NULL
UPDATE 2
And with a final AND listA<>listB AND CHARINDEX(',',listA)=0 you would exclude identical listA and listB as long as there is only one column (=> no comma)
Set logic to the rescue!
SELECT DISTINCT TABLE_NAME
FROM INFORMATION_SCHEMA.COLUMNS AS c
WHERE COLUMN_NAME LIKE '%A%'
AND COLUMN_NAME NOT LIKE '%B%'
INTERSECT
SELECT DISTINCT TABLE_NAME
FROM INFORMATION_SCHEMA.COLUMNS AS c
WHERE COLUMN_NAME LIKE '%B%'
AND COLUMN_NAME NOT LIKE '%A%'
Updated to mutually exclude double matches.
Removed inner join on TABLE schema
One method just uses aggregation and having:
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.COLUMNS
GROUP BY TABLE_NAME
HAVING SUM(CASE WHEN COLUMN_NAME LIKE '%A%' THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN COLUMN_NAME LIKE '%B%' THEN 1 ELSE 0 END) > 0;
If necessary, you can join back to INFORMATION_SCHEMA.COLUMNS to get the column names, although that is not your actual question.

What is the T-SQL syntax to exclude a duplicate column in the output when joining 2 tables?

I am using SQL Server 2014 and I have the following T-SQL query which joins 2 tables:
SELECT a.*, b.* FROM TEMP a
INNER JOIN Extras b ON b.ResaID = a.ResaID
I would like to pull ALL the columns from TEMP and all the columns from "Extras" with the exception of the ResaID column as it is already included in a.* in the above query. Basically, I want to pull a.* + b.* (excluding b.ResaID).
I know I can write the query in the form:
Select a.*, b.column2, b.column3,...
but since b.* has got around 40 columns, is there a way to write the query in a more simplified way to exclude b.ResaID, rather than specify each of the columns in the "Extras" table?
Unfortunately, there is no such syntax. You could either use asterisks (*) and just ignore the duplicated column in your code, or explicitly list the columns you need.
You should create a view and select the columns you need from that view. Here is a script that will generate that view for you:
DECLARE @table1 nvarchar(20) = 'temp'
DECLARE @table1key nvarchar(20) = 'ResaID'
DECLARE @table2 nvarchar(20) = 'Extras'
DECLARE @table2key nvarchar(20) = 'ResaID'
DECLARE @viewname varchar(20) = 'v_myview'
DECLARE @sql varchar(max) = ''
SELECT @sql += '], a.[' + column_name
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @table1
SELECT @sql += '], b.[' + column_name
FROM
(
SELECT column_name
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @table2
EXCEPT
SELECT column_name
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @table1
) x
SELECT
@sql = 'CREATE view ' +@viewname+ ' as SELECT '
+ STUFF(@sql, 1, 3, '') + '] FROM ['
+@table1+ '] a JOIN ['+ @table2
+'] b ON ' + 'a.' + @table1key + '=b.' + @table2key
EXEC(@sql)
You can simply solve this using a dynamic sql query.
DECLARE @V_SQL AS NVARCHAR(2000)='' --variable to store dynamic query
,@V_TAB1 AS NVARCHAR(200)='TEMP' --First Table
,@V_TAB2 AS NVARCHAR(200)='Extras' --Second Table
,@V_CONDITION AS NVARCHAR(2000)='A.ResaID = B.ResaID' --Conditions
SELECT @V_SQL = STUFF(
( SELECT ', '+TCOL_NAME
FROM
( SELECT 'A.'+S.NAME AS TCOL_NAME
FROM SYSCOLUMNS AS S
WHERE OBJECT_NAME(ID) = @V_TAB1
UNION ALL
SELECT 'B.'+S.NAME
FROM SYSCOLUMNS AS S
WHERE OBJECT_NAME(ID) = @V_TAB2
AND S.NAME NOT IN (SELECT S.NAME
FROM SYSCOLUMNS AS S
WHERE OBJECT_NAME(ID) = @V_TAB1)
) D
FOR XML PATH('')
),1,2,'')
EXECUTE ('SELECT '+@V_SQL+'
FROM '+@V_TAB1+' AS A
INNER JOIN '+@V_TAB2+' AS B ON '+@V_CONDITION+' ')
