How do I compare two columns from different tables in different databases? - sql-server

Say I have Table_1 in Database_1 with 25 Columns
and say I have Table_2 in Database_2 with 19 Columns
I want to compare the columns in Table_1 and Table_2 and output Columns that exist in Table_1 but not in Table_2
I tried
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME='Table_1'
EXCEPT
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME='Table_2'
The problem is: if I am in Database_1, it only finds the columns of Table_1 and returns an empty list for Table_2; if I am in Database_2, it only finds the columns of Table_2 and returns an empty list for Table_1. If I am in master, it returns an empty list for both Table_1 and Table_2. How do I properly locate each table and its columns from one database?

You can access any database object from any database context by fully qualifying the object name in the form database.schema.object.
In SQL Server you are better off using the sys catalog views, which (if performance matters) are a better choice than the INFORMATION_SCHEMA views.
So you can do
select name
from database_1.sys.columns
where object_id=object_id(N'database_1.dbo.table_1') -- qualify the table with its own schema (dbo assumed here), not sys
except
select name
from database_2.sys.columns
where object_id=object_id(N'database_2.dbo.table_2')
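The same three-part naming also works with the INFORMATION_SCHEMA views used in the question. A minimal sketch, assuming both tables live in the dbo schema (the schema name is an assumption, it is not stated in the question):

```sql
-- Columns that exist in Database_1.dbo.Table_1 but not in Database_2.dbo.Table_2.
-- The dbo schema is assumed; adjust TABLE_SCHEMA if the tables live elsewhere.
SELECT COLUMN_NAME COLLATE DATABASE_DEFAULT AS COLUMN_NAME
FROM Database_1.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME = 'Table_1'
EXCEPT
SELECT COLUMN_NAME COLLATE DATABASE_DEFAULT
FROM Database_2.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME = 'Table_2';
```

Because every reference is fully qualified, this runs from any database context, including master. The COLLATE DATABASE_DEFAULT clause is only there to avoid a collation conflict if the two databases use different collations.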

Related

Compare 3 SQL tables from 3 different databases in Microsoft SQL Server

I need to compare three tables from three different databases in SQL Server. Is this even possible?
I have 3 different databases: prod, test1, test2. Each database has a table of definitions called DEFINITIONS. The values in each table differ depending on the database. My job is to compare these 3 tables and point out the differences.
I was thinking about using the EXCEPT or INTERSECT operators to show the differences or similarities between these 3 tables but I cannot find any information how to merge these 3 databases.
Thanks for any tips!
You can do it by using except / intersect...
Main idea:
-- Parentheses are needed: EXCEPT and UNION are evaluated left to right
-- Rows that exist in db1 but not in db2
(
select * from db1.dbo.table1 t
except
select * from db2.dbo.table2 t
)
union
-- Rows that exist in db2 but not in db1
(
select * from db2.dbo.table2 t
except
select * from db1.dbo.table1 t
)
-- Etc...
To get the similarities, change EXCEPT to INTERSECT.
The problem with this solution is that a row that differs in a single column shows up twice, once from db1 and once from db2.
This can be solved by using a FULL OUTER JOIN on the primary keys of both tables and displaying the row values side by side.
Something like:
select CASE WHEN t.ID IS NULL THEN 'Missing in 1' WHEN t2.ID IS NULL THEN 'Missing in 2' ELSE 'Exists in both' END AS row_status
, t.*, t2.*
from db1.dbo.table1 t
FULL OUTER JOIN db2.dbo.table2 t2
ON t2.ID = t.ID
Then you just need to format data for your usage.
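Since the question involves three databases, the FULL OUTER JOIN idea can be chained. A rough sketch, assuming each DEFINITIONS table is in the dbo schema and has a key column named ID (both are assumptions made for illustration):

```sql
-- One row per ID found in any of the three databases, with a flag per database.
-- dbo and the ID key column are assumptions, not taken from the question.
SELECT COALESCE(p.ID, t1.ID, t2.ID) AS definition_id,
       CASE WHEN p.ID  IS NULL THEN 'missing' ELSE 'present' END AS in_prod,
       CASE WHEN t1.ID IS NULL THEN 'missing' ELSE 'present' END AS in_test1,
       CASE WHEN t2.ID IS NULL THEN 'missing' ELSE 'present' END AS in_test2
FROM prod.dbo.DEFINITIONS AS p
FULL OUTER JOIN test1.dbo.DEFINITIONS AS t1
    ON t1.ID = p.ID
-- COALESCE lets a test2 row match an ID that exists in test1 but not in prod.
FULL OUTER JOIN test2.dbo.DEFINITIONS AS t2
    ON t2.ID = COALESCE(p.ID, t1.ID)
ORDER BY definition_id;
```

To compare values as well as presence, the value columns of p, t1 and t2 can be added to the select list next to the flags.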
A couple of caveats of these approaches:
All tables must have the same number and types of columns for EXCEPT with SELECT * to work. Otherwise you need to choose which columns to match.
The collations of varchar columns should match between the two database tables, otherwise EXCEPT / INTERSECT fails with a collation conflict error. You can solve this by "re-collating" the columns: SELECT ..., somevarcharcolumn COLLATE DATABASE_DEFAULT
There are also tools for this in Visual Studio and probably other clients (schema and data compare).
Excel has some nice functions for this too: if you load the matching rows from each table, you can color the differing fields by using VLOOKUP etc.
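The column-list and collation caveats can be handled in the same query by naming the columns and re-collating the character ones. A sketch in which ID, DefinitionName and DefinitionValue are hypothetical column names, with DefinitionName standing in for a varchar column:

```sql
-- Compare only the listed columns. COLLATE DATABASE_DEFAULT puts both sides
-- of the comparison on the collation of the current database.
SELECT ID, DefinitionName COLLATE DATABASE_DEFAULT AS DefinitionName, DefinitionValue
FROM db1.dbo.table1
EXCEPT
SELECT ID, DefinitionName COLLATE DATABASE_DEFAULT AS DefinitionName, DefinitionValue
FROM db2.dbo.table2;
```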

How to see the data types of all columns in SQL Server Management Studio

I could not find an answer to this question, despite it being very basic. How do I find out the data type of every column in SQL Server Management Studio?
Col1 Col2 Col3 and so on
I wish to know the data type of each column in, say, Table1, where Table1 is the name of my table.
There are a couple of options for seeing the data types of the columns of a table:
Option 1
sp_help <tableName> e.g. sp_help Table1
Option 2
SELECT *
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Table1'
Option 3
In Object Explorer, expand Tables
Expand the desired table
Expand Columns; each column is listed with its data type
There are several ways to do this; one of them is to query INFORMATION_SCHEMA:
select *
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME = 'Table1' or
COLUMN_Name = 'col1';
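SELECT * on INFORMATION_SCHEMA.COLUMNS returns many columns that are rarely needed. A narrower sketch that keeps only the type-related ones, assuming the table is called Table1 as in the question:

```sql
SELECT COLUMN_NAME,
       DATA_TYPE,
       CHARACTER_MAXIMUM_LENGTH,  -- -1 for (max) types; NULL for types without a length
       NUMERIC_PRECISION,
       NUMERIC_SCALE,
       IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Table1'
ORDER BY ORDINAL_POSITION;
```

Ordering by ORDINAL_POSITION lists the columns in the order in which they are defined in the table.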

How to get a list of all tables in two different databases

I'm trying to create a little SQL script (in SQL Server Management Studio) to get a list of all tables in two different databases. The goal is to find out which tables exist in both databases and which ones only exist in one of them.
I have found various scripts on SO to list all the tables of one database, but so far I wasn't able to get a list of tables of multiple databases.
So: is there a way to query SQL Server for all tables in a specific database, e.g. SELECT * FROM ... WHERE databaseName='first_db' so that I can join this with the result for another database?
SELECT * FROM database1.INFORMATION_SCHEMA.TABLES
UNION ALL
SELECT * FROM database2.INFORMATION_SCHEMA.TABLES
UPDATE
In order to compare the two lists, you can use FULL OUTER JOIN, which will show you the tables that are present in both databases as well as those that are only present in one of them:
SELECT *
FROM database1.INFORMATION_SCHEMA.TABLES db1
FULL JOIN database2.INFORMATION_SCHEMA.TABLES db2
ON db1.TABLE_NAME = db2.TABLE_NAME
ORDER BY COALESCE(db1.TABLE_NAME, db2.TABLE_NAME)
You can also add WHERE db1.TABLE_NAME IS NULL OR db2.TABLE_NAME IS NULL to see only the differences between the databases.
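Putting that filter into the query gives a list of only the tables that are missing on one side. A sketch using the database1 and database2 placeholders from above; matching on TABLE_SCHEMA as well is an addition made here, so that same-named tables in different schemas are not treated as the same table:

```sql
SELECT COALESCE(db1.TABLE_SCHEMA, db2.TABLE_SCHEMA) AS tbl_schema,
       COALESCE(db1.TABLE_NAME, db2.TABLE_NAME)     AS tbl_name,
       CASE WHEN db2.TABLE_NAME IS NULL THEN 'only in database1'
            ELSE 'only in database2'
       END AS found_in
FROM database1.INFORMATION_SCHEMA.TABLES AS db1
FULL JOIN database2.INFORMATION_SCHEMA.TABLES AS db2
    ON  db1.TABLE_SCHEMA = db2.TABLE_SCHEMA
    AND db1.TABLE_NAME   = db2.TABLE_NAME
-- Keep only the rows where one side found no match.
WHERE db1.TABLE_NAME IS NULL
   OR db2.TABLE_NAME IS NULL
ORDER BY tbl_schema, tbl_name;
```

If the two databases use different collations, the join may need COLLATE DATABASE_DEFAULT on the compared columns.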
As far as I know, an unqualified sys.tables only covers the active database. But you could store each list in a temporary table and join the results:
use db1
select (...) into #TableList from sys.tables
use db2
select (...) into #TableList2 from sys.tables
select * from #TableList tl1 join #TableList2 tl2 on ...
Just for completeness, this is the query I finally used (based on Andriy M's answer):
SELECT * FROM DB1.INFORMATION_SCHEMA.Tables db1
LEFT OUTER JOIN DB2.INFORMATION_SCHEMA.Tables db2
ON db1.TABLE_NAME = db2.TABLE_NAME
ORDER BY db1.TABLE_NAME
To find out which tables exist in db2, but not in db1, replace the LEFT OUTER JOIN with a RIGHT OUTER JOIN.

How to SELECT * but without "Column names must be unique in each view"

I need to encapsulate a set of table JOINs that we frequently make use of on a vendor's database server. We reuse the same JOIN logic in many places, in extracts etc., and it seemed a VIEW would allow the JOINs to be defined and maintained in one place.
CREATE VIEW MasterView
AS
SELECT *
FROM entity_1 e1
INNER JOIN entity_2 e2 ON e2.parent_id = e1.id
INNER JOIN entity_3 e3 ON e3.parent_id = e2.id
/* other joins including business logic */
etc.
The trouble is that the vendor makes regular changes to the DB (column additions, name changes) and I want that to be reflected in the "MasterView" automatically.
SELECT * would allow this, but the underlying tables all have ID columns so I get the "Column names in each view must be unique" error.
I specifically want to avoid listing the column names from the tables because a) it requires frequent maintenance b) there are several hundred columns per table.
Is there any way to achieve the dynamism of SELECT * but effectively exclude certain columns (i.e. the ID ones)?
Thanks
"I specifically want to avoid listing the column names from the tables because a) it requires frequent maintenance b) there are several hundred columns per table."
In this case, you can't avoid it. You must specify the column names, and use an alias for those columns with duplicate names. Code generation can help with this many columns.
SELECT * is bad practice regardless - if someone adds a 2GB binary column to one of these tables and populates it, do you really want it to be returned?
One simple method to generate the columns you want is
select column_name+',' from information_schema.columns
where table_name='tt'
and column_name not in('ID')
As well as Oded's answer (100% agree with)...
If someone changes the underlying tables, you need view maintenance anyway (with sp_refreshview). The column changes will not appear in the view automatically. See "select * from table" vs "select colA, colB, etc. from table" interesting behaviour in SQL Server 2005
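So your requirement that changes be "reflected in the MasterView automatically" can't be satisfied anyway.
If you want to ensure the view is up to date, use WITH SCHEMABINDING, which prevents changes to the underlying tables until the binding is removed or the view is dropped. To change columns, drop the view, make the changes, then re-create the view. Note that a schema-bound view cannot use SELECT *, so its column list must be explicit.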
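For reference, refreshing a view's metadata after an underlying table has changed is a single call. A sketch, assuming the view is dbo.MasterView (the dbo schema is an assumption):

```sql
-- Re-reads the definitions of the underlying objects so that a SELECT * view
-- picks up added or removed columns. Applies to views that are not schema-bound.
EXEC sp_refreshview N'dbo.MasterView';
```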
I had the same issue, see example below:
ALTER VIEW Summary AS
SELECT * FROM Table1 AS t1
INNER JOIN Table2 AS t2 ON t1.Id = t2.Id
and I ran into the error you mentioned. The easiest solution is to put the table alias before the *, like this:
SELECT t1.* FROM Table1 AS t1
INNER JOIN Table2 AS t2 ON t1.Id = t2.Id
You shouldn't see that error anymore.
I had gone with this in the end, building off of Madhivanan's suggestion. It's similar to what t-clausen.dk later suggested (thanks for your efforts) though I find the xml path style more elegant than cursors / rank partitions.
The following recreates the MasterView definition when run. All columns in the underlying tables are prepended with the table name, so I can include two similarly named columns in the view by default. This alone solves my original problem, but I also included the "WHERE column_name NOT IN" clause to specifically exclude certain columns that will never be used in the MasterView.
create procedure Utility_RefreshMasterView
as
begin
declare @entity_columns varchar(max)
declare @drop_view_sql varchar(max)
declare @alter_view_definition_sql varchar(max)
/* create comma separated string of columns from underlying tables aliased to avoid name collisions */
select @entity_columns = stuff((
select ','+table_name+'.['+column_name+'] AS ['+table_name+'_'+column_name+']'
from information_schema.columns
where table_name IN ('entity_1', 'entity_2')
and column_name not in ('column to exclude 1', 'column to exclude 2')
for xml path('')), 1, 1, '')
set @drop_view_sql = 'if exists (select * from sys.views where object_id = object_id(N''[dbo].[MasterView]'')) drop view MasterView'
set @alter_view_definition_sql =
'create view MasterView as select ' + @entity_columns + '
from entity_1
inner join entity_2 on entity_2.id = entity_1.id
/* other joins follow */'
exec (@drop_view_sql)
exec (@alter_view_definition_sql)
end
If you use SELECT * together with a JOIN, the result may include several columns with the same name, which is not allowed in a view. The query runs fine on its own, but not when creating the view.
For example:
Table A
ID, CatalogName, CatalogDescription
Table B
ID, CatalogName, CatalogDescription
After the JOIN query
ID, CatalogName, CatalogDescription, ID, CatalogName, CatalogDescription
That's not possible in a View.
Specify a unique name for each column in the view. Using just * is not a very good practice.

sql server 2005 - select records from tbl A contained WITHIN a text field of tbl B

I'm trying to work out a SQL Select in MS SQL 2005, to do the following:
TABLE_A contains a list of keywords... asparagus, beetroot, beans, egg plant etc (x200).
TABLE_B contains a record with some long free text (approx 4000 chars)...
I know what record within TABLE_B I am selecting (byID).
However I need to get a shortlist of records from TABLE_A that are contained WITHIN the text of the record in TABLE_B.
I'm wondering if SQL's CONTAINS function is useful... but maybe not.
This needs to be a super quick query.
Cheers
It will never be super quick because of the LIKE with a wildcard at each end. You cannot index it and there are no whizzy tricks. However, because you have already filtered TableB down to one record, it should be acceptable. If you had a million rows in TableB, you could go for coffee while it ran.
SELECT
A.KeyWordColumn
FROM
TableA A
JOIN
TableB B ON B.BigTextColumn LIKE '%' + A.KeyWordColumn+ '%'
WHERE
B.ByID = @ID --or constant etc
CONTAINS can be used if you have full-text indexing, but not in a normal SQL query.
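An equivalent formulation uses CHARINDEX instead of LIKE, which avoids surprises when a keyword contains LIKE wildcard characters such as % or _. A sketch that reuses the table and column names from the query above and assumes BigTextColumn is varchar or nvarchar rather than text; the ID value is hypothetical:

```sql
DECLARE @ID int;
SET @ID = 111; -- hypothetical ID of the TableB record to search

-- CHARINDEX returns the position of the keyword in the text, or 0 if absent.
SELECT A.KeyWordColumn
FROM TableA AS A
JOIN TableB AS B
    ON CHARINDEX(A.KeyWordColumn, B.BigTextColumn) > 0
WHERE B.ByID = @ID;
```

This still checks every keyword against the text, so it should not be expected to run faster than the LIKE version; it only changes how the keyword is interpreted.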
I would try this
select keyword from table_a, table_b
where table_b.text like '%' + keyword + '%'
and table_b.Id = '111'
