SQL Server: relationship between columns in different tables

I would like to find a way to understand if there is a relationship between two columns present in two different tables.
For example in the table [Sales].[SalesOrderHeader], I have a column SalesOrderID and in another table [Person].[EmailAddress], there is BusinessEntityID.
How can I check to see if there is a table that creates a relationship between these 2 columns? Or how can I be sure that there is not a relationship between these 2 columns?

INFORMATION_SCHEMA is what you are looking for. You can see whether or not a given column is used in a constraint by executing
SELECT * FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE WHERE TABLE_SCHEMA = 'Person' AND TABLE_NAME = 'EmailAddress' AND COLUMN_NAME = 'BusinessEntityID'
You will have to do some additional work to focus on your specific solution, but this is where to start.
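As a rough sketch of that additional work, the catalog views sys.foreign_keys and sys.foreign_key_columns can list every declared foreign key that touches either table. The two table names come from the question; everything else is standard catalog metadata. An empty result only means that no direct foreign key is declared, not that the data is unrelated, and relationships that pass through an intermediate table have to be followed one hop at a time.

```sql
-- List declared foreign keys where either table is the referencing or the referenced side.
SELECT fk.name AS constraint_name,
       OBJECT_SCHEMA_NAME(fkc.parent_object_id)     AS referencing_schema,
       OBJECT_NAME(fkc.parent_object_id)            AS referencing_table,
       COL_NAME(fkc.parent_object_id, fkc.parent_column_id) AS referencing_column,
       OBJECT_SCHEMA_NAME(fkc.referenced_object_id) AS referenced_schema,
       OBJECT_NAME(fkc.referenced_object_id)        AS referenced_table,
       COL_NAME(fkc.referenced_object_id, fkc.referenced_column_id) AS referenced_column
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
  ON fkc.constraint_object_id = fk.object_id
WHERE fkc.parent_object_id IN (OBJECT_ID(N'Sales.SalesOrderHeader'), OBJECT_ID(N'Person.EmailAddress'))
   OR fkc.referenced_object_id IN (OBJECT_ID(N'Sales.SalesOrderHeader'), OBJECT_ID(N'Person.EmailAddress'))
ORDER BY referencing_table, constraint_name;
```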

You could do the following to find the tables that reference [Sales].[SalesOrderHeader]:
EXEC sp_fkeys @pktable_name = N'SalesOrderHeader', @pktable_owner = N'Sales';
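The same procedure can be pointed in the other direction. A minimal sketch, assuming you want to see which primary-key tables [Person].[EmailAddress] itself references, uses the @fktable_name and @fktable_owner parameters instead:

```sql
-- Foreign keys defined on Person.EmailAddress and the tables they point to.
EXEC sp_fkeys @fktable_name = N'EmailAddress', @fktable_owner = N'Person';
```

If neither call returns a row that links the two tables, no direct foreign key is declared between them.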

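I apologize beforehand for what follows:
create table #rels (rel_name nvarchar(max), matches int);
declare @sql nvarchar(max) = N'';
-- Brute force: one INSERT per pair of same-typed columns, counting the rows that join.
-- Expect it to be slow; types that cannot be cast to varchar(max) will raise errors.
select @sql += char(10)
+ 'insert into #rels select ''' + tbl_a + '.' + col_a + ' = ' + tbl_b + '.' + col_b + ''' colrel, count(*)'
+ ' from ' + tbl_a + ' a join ' + tbl_b + ' b'
+ ' on cast(a.' + col_a + ' as varchar(max)) = cast(b.' + col_b + ' as varchar(max));'
from (
select quotename(a.name) col_a,
quotename(object_schema_name(a.object_id)) + '.' + quotename(object_name(a.object_id)) tbl_a,
quotename(b.name) col_b,
quotename(object_schema_name(b.object_id)) + '.' + quotename(object_name(b.object_id)) tbl_b
from sys.columns a
join sys.columns b on a.system_type_id = b.system_type_id
where a.name <> b.name
and objectproperty(a.object_id, 'IsUserTable') = 1
and objectproperty(b.object_id, 'IsUserTable') = 1
) cols;
exec (@sql);
select * from #rels where matches > 0 order by matches desc;
drop table #rels;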

Related

SQL Server : dynamically select column based on another select column

I would like to write a query that returns two columns. The first column would be all of the column names of an existing table (ImportTable) sorted alphabetically, and the second column would be a sample row from that table (ImportTable) showing a potential value for one of those columns.
This is the pseudo code I have so far:
select
c.name as 'Column Name',
(select top(1) ImportTable."c.name"
from Database.dbo.[ImportTable] ImportTable) as 'Sample Column Value'
from
Database.sys.columns c
inner join
Database.sys.objects o on o.object_id = c.object_id
where
o.name = 'ImportTable'
order by
c.name asc
I don't know how to dynamically select the column ImportTable."c.name" based on the value of the column name. I'm not even sure what to search for that.
You need to create a dynamic unpivot statement and execute it.
Try not to get confused about what is the static and what the dynamic parts.
DECLARE @tablename sysname = N'ImportTable'; -- assumed: unqualified name of the table to sample

DECLARE @sql nvarchar(max) = N'
SELECT v.*
FROM (
SELECT TOP (1) *
FROM ' + QUOTENAME(@tablename) + N'
ORDER BY CHECKSUM(NEWID())
) t
CROSS APPLY (VALUES
' +
(
SELECT STRING_AGG(CAST(
N'(' + QUOTENAME(c.name, '''') + N', CAST(t.' + QUOTENAME(c.name) + N' AS sql_variant))'
AS nvarchar(max)), N',
') WITHIN GROUP (ORDER BY c.name ASC)
FROM sys.columns c
WHERE c.object_id = OBJECT_ID(@tablename)
) +
N'
) AS v(columnName, columnValue);
';
PRINT @sql; -- for testing
EXEC sp_executesql @sql;
If you are unfortunate enough to still be on a version not supporting STRING_AGG, you can use FOR XML instead for that part:
-------
CROSS APPLY (VALUES
' +
STUFF(
(
SELECT N',
(' + QUOTENAME(c.name, '''') + N', CAST(t.' + QUOTENAME(c.name) + N' AS sql_variant))'
FROM sys.columns c
WHERE c.object_id = OBJECT_ID(@tablename)
ORDER BY c.name ASC
FOR XML PATH(''), TYPE
).value('text()[1]', 'nvarchar(max)'), 1, 2, '') +
N'
) AS v(columnName, columnValue);
';
--------
With a little dynamic SQL, you can get this done with just a series of UNION selects. This doesn't require string_agg or any recent SQL feature.
declare @table sysname = N'schema.YourTableName';
-- Begin with a CTE to select 1 row at random:
declare @stmt nvarchar(max) = N'with cte as (select top(1)* from ' + @table + ' order by newid())';
-- Create string of UNION selects:
select @stmt = @stmt
+ 'select '''
+ name
+ ''' as ColumnName, cast('
+ quotename(name)
+ ' as varchar(max)) as SampleValue from cte union '
from sys.columns
where object_id = object_id(@table);
-- Remove the trailing "union"
set @stmt = left(@stmt,len(@stmt)-5);
-- Add the ORDER
set @stmt = @stmt + ' order by ColumnName';
-- Execute the query:
exec sp_executesql @stmt = @stmt;

Summary of dynamically added columns for each row

I have a table with many dynamically generated columns and rows identified by USER_KEY (of type INT).
The dynamically added columns are of type DECIMAL(15,2).
My table looks like this:
What I would like to do is get a total for each user across all the columns in that row. Since there are so many of them and they are dynamically generated, I cannot hard-code the column names. How can I do this?
I also have a variable @COLDEPARTMENTS, in which all those dynamic columns are listed, separated by commas, like this:
[120000003],[120000002],[140000001],[120000005],[120000021],[120000025]
I assume you are using temp tables
select *
from tempdb.INFORMATION_SCHEMA.COLUMNS
where table_name like '#MyTempTable%'
Please refer to "How to retrieve field names from temporary table (SQL Server 2008)".
You can use the script below to get all the auto-generated columns as a single comma-separated row.
Declare @tmp varchar(max)
SET @tmp = ''
select @tmp = @tmp + Column_Name + ', ' from [AdventureWorksDW2014].INFORMATION_SCHEMA.COLUMNS
where table_name like '%FactInternetSales%'
select SUBSTRING(@tmp, 0, LEN(@tmp)) as Columns
You can also store the result in a variable and use it with your original table.
If you are trying to identify which columns have values and you don't want to type out the column names, the following generates a query that flags each NULL column (1 when NULL, 0 otherwise).
SELECT 'SELECT user_key, '+
+ cols.ColumnList + CHAR(10) +
+ ' FROM ' + QUOTENAME(SCHEMA_NAME(t.schema_id)) + '.' + QUOTENAME(t.name) + CHAR(10)
FROM sys.tables t
CROSS APPLY (SELECT DISTINCT STUFF(
( SELECT CHAR(10) + CHAR(9) + ', '
+ QUOTENAME(c.name) + ' = CASE WHEN ' + QUOTENAME(c.name) + ' IS NULL THEN 1 ELSE 0 END'
FROM sys.columns c
WHERE c.object_id = t.object_id
AND c.name != 'user_key'
FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)'),1,3,'') AS ColumnList
)cols
WHERE t.name = '{TABLE NAME}'
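Returning to the original question of a per-user total, one possible sketch turns the comma-separated list in @COLDEPARTMENTS directly into a sum expression. The table name dbo.MyPivotTable is a placeholder, not something from the question, and the REPLACE trick assumes that no column name contains a comma.

```sql
-- Sample value copied from the question; in practice the variable is already populated.
DECLARE @COLDEPARTMENTS nvarchar(max) = N'[120000003],[120000002],[140000001]';

-- Turn "[a],[b],[c]" into "ISNULL([a], 0) + ISNULL([b], 0) + ISNULL([c], 0)".
DECLARE @sumExpr nvarchar(max) =
    N'ISNULL(' + REPLACE(@COLDEPARTMENTS, N',', N', 0) + ISNULL(') + N', 0)';

-- dbo.MyPivotTable is a hypothetical name; substitute the real table.
DECLARE @sql nvarchar(max) =
    N'SELECT USER_KEY, ' + @sumExpr + N' AS Total FROM dbo.MyPivotTable;';

PRINT @sql; -- inspect the generated statement before running it
EXEC sp_executesql @sql;
```

ISNULL keeps a single NULL column from turning the whole row total into NULL.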

What is the T-SQL syntax to exclude a duplicate column in the output when joining 2 tables?

I am using SQL Server 2014 and I have the following T-SQL query which joins 2 tables:
SELECT a.*, b.* FROM TEMP a
INNER JOIN Extras b ON b.ResaID = a.ResaID
I would like to pull ALL the columns from TEMP and all the columns from "Extras" with the exception of the ResaID column as it is already included in a.* in the above query. Basically, I want to pull a.* + b.* (excluding b.ResaID).
I know I can write the query in the form:
Select a.*, b.column2, b.column3,...
but since b.* has got around 40 columns, is there a way to write the query in a more simplified way to exclude b.ResaID, rather than specify each of the columns in the "Extras" table?
Unfortunately, there is no such syntax. You could either use asterisks (*) and just ignore the duplicated column in your code, or explicitly list the columns you need.
You should create a view and select the columns you need from that view. Here is a script that will generate that view for you:
DECLARE @table1 nvarchar(20) = 'temp'
DECLARE @table1key nvarchar(20) = 'ResaID'
DECLARE @table2 nvarchar(20) = 'Extras'
DECLARE @table2key nvarchar(20) = 'ResaID'
DECLARE @viewname varchar(20) = 'v_myview'
DECLARE @sql varchar(max) = ''
SELECT @sql += '], a.[' + column_name
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @table1
SELECT @sql += '], b.[' + column_name
FROM
(
SELECT column_name
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @table2
EXCEPT
SELECT column_name
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @table1
) x
SELECT
@sql = 'CREATE view ' +@viewname+ ' as SELECT '
+ STUFF(@sql, 1, 3, '') + '] FROM ['
+@table1+ '] a JOIN ['+ @table2
+'] b ON ' + 'a.' + @table1key + '=b.' + @table2key
EXEC(@sql)
You can simply solve this using a dynamic sql query.
DECLARE @V_SQL AS NVARCHAR(MAX)='' --variable to store dynamic query
,@V_TAB1 AS NVARCHAR(200)='TEMP' --First Table
,@V_TAB2 AS NVARCHAR(200)='Extras' --Second Table
,@V_CONDITION AS NVARCHAR(2000)='A.ResaID = B.ResaID' --Conditions
SELECT @V_SQL = STUFF(
( SELECT ', '+TCOL_NAME
FROM
( SELECT 'A.'+S.NAME AS TCOL_NAME
FROM SYSCOLUMNS AS S
WHERE OBJECT_NAME(ID) = @V_TAB1
UNION ALL
SELECT 'B.'+S.NAME
FROM SYSCOLUMNS AS S
WHERE OBJECT_NAME(ID) = @V_TAB2
AND S.NAME NOT IN (SELECT S.NAME
FROM SYSCOLUMNS AS S
WHERE OBJECT_NAME(ID) = @V_TAB1)
) D
FOR XML PATH('')
),1,2,'')
EXECUTE ('SELECT '+@V_SQL+'
FROM '+@V_TAB1+' AS A
INNER JOIN '+@V_TAB2+' AS B ON '+@V_CONDITION+' ')

SQL Server - find all tables in database with unique ID value

I want to find all the tables in the database (in SQL Server) that have the column IDName with a particular value 'SAM', like so: IDName='SAM'. My initial approach was to create a table listing all the tables that have the column "IDName" (since not all the tables in the database have this column). Then I was thinking of going through each table to see which ones match IDName='SAM'. This is where I'm stuck. I'm pretty sure there is a much faster way of doing this too, but I'm not very familiar with database query coding. Anything will help. Thanks!
select * into tmp from
(
SELECT SO.NAME AS TableName, SC.NAME AS ColumnName
FROM dbo.sysobjects SO INNER JOIN dbo.syscolumns SC ON SO.id = SC.id
WHERE sc.name = 'IDName' and SO.type = 'U'
) tablelist
So if I go Select * from tmp, I get the list of tables that have the column "IDName". Now I have to go through each one of that list and see if they have "IDName = 'Sam'" if they do add it to the output table. In the end I want to see all the names of the tables from the database that has the IDName 'Sam'.
DECLARE @sql AS varchar(max) = '';
DECLARE @ColumnName AS varchar(100) = 'IDName'
DECLARE @ResultQuery AS varchar(max) = 'SELECT ''@TableName'' AS TableName ' +
' ,@ColumnName ' +
'FROM @TableName ' +
'WHERE @ColumnName = ''SAM''';
SET @ResultQuery = REPLACE(@ResultQuery, '@ColumnName', QUOTENAME(@ColumnName));
WITH AllTables AS (
SELECT SCHEMA_NAME(Tables.schema_id) AS SchemaName
,Tables.name AS TableName
,Columns.name AS ColumnName
FROM sys.tables AS Tables
INNER JOIN sys.columns AS Columns
ON Tables.object_id = Columns.object_id
WHERE Columns.name = @ColumnName
)
-- Schema-qualify each table so that tables outside dbo resolve correctly.
SELECT @sql = @sql + ' UNION ALL ' +
REPLACE(@ResultQuery, '@TableName', QUOTENAME(SchemaName) + '.' + QUOTENAME(TableName)) + CHAR(13)
FROM AllTables
SET @sql = STUFF(@sql, 1, LEN(' UNION ALL'), '');
--PRINT @sql;
SET @sql =
'WITH AllTables AS ( ' +
@sql +
') ' +
'SELECT DISTINCT TableName ' +
'FROM AllTables ';
EXEC (@sql)

Get a list of all columns that do not have only NULL values in SQL Server

I NEVER do complicated stuff in SQL - until now...
I have a database with over 2000 tables, each table has about 200 columns.
I need to get a list of all the columns in one of those tables that are populated at least 1 time.
I can get a list of all the columns like this:
SELECT [name] AS [Column name]
FROM syscolumns with (nolock)
WHERE id = (SELECT id FROM sysobjects where name like 'DOCSDB_TDCCINS')
But I need only the columns that are populated 1 or more times.
Any help would be appreciated.
Here is how I would do it, first run this:
SELECT 'SELECT '''+syscolumns.name+''' FROM '+sysobjects.name+' HAVING COUNT('+syscolumns.name+') > 0'
FROM syscolumns with (nolock)
JOIN sysobjects with (nolock) ON syscolumns.id = sysobjects.id
WHERE syscolumns.id = (SELECT id FROM sysobjects where name like 'Email')
Copy all the select statements and run them.
Each statement returns its column name only if that column has at least one non-NULL value, so together they give you the list of populated columns.
(nb I did not test because I don't have an SQL server available right now, so I could have a typo)
It may also be useful to count the non-NULL instances. Your initial question was only about zero versus non-zero, and counting every instance will be slower than an EXISTS / NOT EXISTS check.
select 'union select ''' + Column_Name + ''',count(*)'
+ ' from ' + table_name
+ ' where ' + column_name + ' is not null'
from
(
select * from information_schema.columns with (nolock)
where Is_Nullable = 'YES'
AND Table_Name like 'DOCSDB_TDCCINS'
) DD
Then remove the superfluous leading 'union' and run the query
A different idea is to create a dynamic unpivot for every table.
Declare @q NVarchar(MAX) = NULL
;With D AS (
SELECT TABLE_SCHEMA
, TABLE_NAME
, STUFF((SELECT ', ' + QUOTENAME(ci.COLUMN_NAME)
FROM INFORMATION_SCHEMA.COLUMNS ci
WHERE (ci.TABLE_NAME = c.TABLE_NAME)
AND (ci.TABLE_SCHEMA = c.TABLE_SCHEMA)
FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)')
,1,2,'') AS _Cols
, STUFF((SELECT ', Count(' + QUOTENAME(ci.COLUMN_NAME) + ') '
+ QUOTENAME(ci.COLUMN_NAME)
FROM INFORMATION_SCHEMA.COLUMNS ci
WHERE (ci.TABLE_NAME = c.TABLE_NAME)
AND (ci.TABLE_SCHEMA = c.TABLE_SCHEMA)
FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)')
,1,2,'') AS _ColsCount
FROM INFORMATION_SCHEMA.COLUMNS c
GROUP BY TABLE_SCHEMA, TABLE_NAME
)
SELECT @q = COALESCE(@q + ' UNION ALL ', '') + '
SELECT ''' + TABLE_SCHEMA + ''' _Schema, ''' + TABLE_NAME + ''' _Table, _Column
FROM (SELECT ' + _ColsCount + ' from ' + TABLE_SCHEMA + '.' + TABLE_NAME + ') x
UNPIVOT
(_Count FOR _Column IN (' + _Cols + ')) u
WHERE _Count > 0'
FROM D
exec sp_executesql @q
In the CTE _Cols returns the comma separated quoted name of the columns of the table, while _ColsCount returns the same list with the COUNT function, for example for a table of mine a row of D is
TABLE_SCHEMA | TABLE_NAME | _Cols | _ColsCount
------------- ----------------- ------------------------------ -----------------------------------------------------------------------------
dbo | AnnualInterests | [Product_ID], [Rate], [Term] | Count([Product_ID]) [Product_ID], Count([Rate]) [Rate], Count([Term]) [Term]
while the main query transforms this row into an UNPIVOT that returns the columns as rows
SELECT 'dbo' _Schema, 'AnnualInterests' _Table, _Column
FROM (SELECT Count([Product_ID]) [Product_ID], Count([Term]) [Term]
, Count([Rate]) [Rate] from dbo.AnnualInterests) x
UNPIVOT
(_Count FOR _Column IN ([Product_ID], [Term], [Rate])) u
WHERE _Count > 0
Concatenating these statements into the string variable and running it with sp_executesql completes the script.
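A simple alteration to your code would be the following, though note that it only checks that the table itself contains rows, not that each individual column is populated: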
SELECT [name] AS [Column name]
FROM syscolumns with (nolock)
WHERE id = (SELECT id FROM sysobjects where name like 'DOCSDB_TDCCINS')
and (select count(*) from DOCSDB_TDCCINS)>0
