We have a database setup that consists of two parts: a static structure, and dynamic additions. For each database, the dynamic can be different, and sometimes we don't have data for all the dynamic fields. Rigt now, we check for empties by looking at the total count of records in the entire table, but we want to move to a more refined method of checking for empties if possible. Is it possible to quickly check through several hundred fields and see which ones are empty and which ones are populated?
For searching for any rows that have NULLS in any column you can do this, first create this proc which is based on the code here Search all columns in all the tables in a database for a specific value
CREATE PROCEDURE FindMyData_StringNull
#DataToFind NVARCHAR(4000),
#ExactMatch BIT = 0
AS
SET NOCOUNT ON
DECLARE #Temp TABLE(RowId INT IDENTITY(1,1), SchemaName sysname, TableName sysname, ColumnName SysName, DataType VARCHAR(100), DataFound BIT)
INSERT INTO #Temp(TableName,SchemaName, ColumnName, DataType)
SELECT C.Table_Name,C.TABLE_SCHEMA, C.Column_Name, C.Data_Type
FROM Information_Schema.Columns AS C
INNER Join Information_Schema.Tables AS T
ON C.Table_Name = T.Table_Name
AND C.TABLE_SCHEMA = T.TABLE_SCHEMA
WHERE Table_Type = 'Base Table'
DECLARE #i INT
DECLARE #MAX INT
DECLARE #TableName sysname
DECLARE #ColumnName sysname
DECLARE #SchemaName sysname
DECLARE #SQL NVARCHAR(4000)
DECLARE #PARAMETERS NVARCHAR(4000)
DECLARE #DataExists BIT
DECLARE #SQLTemplate NVARCHAR(4000)
SELECT #SQLTemplate = 'If Exists(Select *
From ReplaceTableName
Where Convert(nVarChar(4000), [ReplaceColumnName])
IS NULL
)
Set #DataExists = 1
Else
Set #DataExists = 0'
,
#PARAMETERS = '#DataExists Bit OUTPUT',
#i = 1
SELECT #i = 1, #MAX = MAX(RowId)
FROM #Temp
WHILE #i <= #MAX
BEGIN
SELECT #SQL = REPLACE(REPLACE(#SQLTemplate, 'ReplaceTableName', QUOTENAME(SchemaName) + '.' + QUOTENAME(TableName)), 'ReplaceColumnName', ColumnName)
FROM #Temp
WHERE RowId = #i
PRINT #SQL
EXEC SP_EXECUTESQL #SQL, #PARAMETERS, #DataExists = #DataExists OUTPUT
IF #DataExists =1
UPDATE #Temp SET DataFound = 1 WHERE RowId = #i
SET #i = #i + 1
END
SELECT SchemaName,TableName, ColumnName
FROM #Temp
WHERE DataFound = 1
Call it like this
FindMyData_StringNull NULL,1
Assuming that you are just checking for whether or not there are any non-NULL values in the column, using EXISTS should generally be faster than getting a COUNT(*). The COUNT needs to scan the whole table to come up with the correct number. EXISTS just needs to find one row that satisfies the condition before it stops looking.
If the whole column is NULL then the time will be about the same, but in all of those cases where you have values it could be substantially shorter.
From Search all columns in all the tables in a database for a specific value
first create this function
CREATE PROCEDURE FindMyData_String
#DataToFind NVARCHAR(4000),
#ExactMatch BIT = 0
AS
SET NOCOUNT ON
DECLARE #Temp TABLE(RowId INT IDENTITY(1,1), SchemaName sysname, TableName sysname, ColumnName SysName, DataType VARCHAR(100), DataFound BIT)
INSERT INTO #Temp(TableName,SchemaName, ColumnName, DataType)
SELECT C.Table_Name,C.TABLE_SCHEMA, C.Column_Name, C.Data_Type
FROM Information_Schema.Columns AS C
INNER Join Information_Schema.Tables AS T
ON C.Table_Name = T.Table_Name
AND C.TABLE_SCHEMA = T.TABLE_SCHEMA
WHERE Table_Type = 'Base Table'
And Data_Type In ('ntext','text','nvarchar','nchar','varchar','char')
DECLARE #i INT
DECLARE #MAX INT
DECLARE #TableName sysname
DECLARE #ColumnName sysname
DECLARE #SchemaName sysname
DECLARE #SQL NVARCHAR(4000)
DECLARE #PARAMETERS NVARCHAR(4000)
DECLARE #DataExists BIT
DECLARE #SQLTemplate NVARCHAR(4000)
SELECT #SQLTemplate = CASE WHEN #ExactMatch = 1
THEN 'If Exists(Select *
From ReplaceTableName
Where Convert(nVarChar(4000), [ReplaceColumnName])
= ''' + #DataToFind + '''
)
Set #DataExists = 1
Else
Set #DataExists = 0'
ELSE 'If Exists(Select *
From ReplaceTableName
Where Convert(nVarChar(4000), [ReplaceColumnName])
Like ''%' + #DataToFind + '%''
)
Set #DataExists = 1
Else
Set #DataExists = 0'
END,
#PARAMETERS = '#DataExists Bit OUTPUT',
#i = 1
SELECT #i = 1, #MAX = MAX(RowId)
FROM #Temp
WHILE #i <= #MAX
BEGIN
SELECT #SQL = REPLACE(REPLACE(#SQLTemplate, 'ReplaceTableName', QUOTENAME(SchemaName) + '.' + QUOTENAME(TableName)), 'ReplaceColumnName', ColumnName)
FROM #Temp
WHERE RowId = #i
PRINT #SQL
EXEC SP_EXECUTESQL #SQL, #PARAMETERS, #DataExists = #DataExists OUTPUT
IF #DataExists =1
UPDATE #Temp SET DataFound = 1 WHERE RowId = #i
SET #i = #i + 1
END
SELECT SchemaName,TableName, ColumnName
FROM #Temp
WHERE DataFound = 1
Now call it like this for rows with empty strings in any string type columns
exec FindMyData_String '',1
it will give you an output with column name, table name and schema name
Just keep in mind that it will search all tables
I would think the simplest solution is to use the CHECKSUM function. First you would want to determine the checksum on an empty row and then compare that to the other rows.
Select Checksum(*)
From Table
The catch with using * here is that it will include the PK. You would likely have to specify the individual columns excluding the PK to get an accurate read. So something like:
Select Checksum(Col1, Col2, Col3)
From Table
Checksum Function.
Related
I'm using a while loop to iterate through some tables and carry out a replace on a field, to remove all the hyphens. All of the fields are varchar.
DECLARE #Table TABLE (TableName VARCHAR(max),Id int identity(1,1))
INSERT INTO #Table
Select distinct table_name From INFORMATION_SCHEMA.COLUMNS
DECLARE #max int
DECLARE #SQL VARCHAR(max)
DECLARE #TableName VARCHAR(max)
DECLARE #id int = 1
select #max = MAX(Id) from #Table
WHILE (#id <= #max)
BEGIN
SELECT #TableName = TableName FROM #Table WHERE Id = #id
SET #SQL = 'update '+ #TableName +' set colA = replace(colA,'-','');'
EXEC(#SQL)
SET #id = #id +1
END
the error I receive is:
The data types varchar(max) and varchar are incompatible in the subtract operator.
I've tried changing the varchar variables to fixed lengths or all to max, but nothing seems to work.
You need to use two single quotes when creating a single quote inside of a string:
DECLARE #Table TABLE (TableName VARCHAR(max),Id int identity(1,1))
INSERT INTO #Table
Select distinct table_name From INFORMATION_SCHEMA.COLUMNS
DECLARE #max int
DECLARE #SQL VARCHAR(max)
DECLARE #TableName VARCHAR(max)
DECLARE #id int = 1
select #max = MAX(Id) from #Table
WHILE (#id <= #max)
BEGIN
SELECT #TableName = TableName FROM #Table WHERE Id = #id
SET #SQL = 'update '+ #TableName +' set colA = replace(colA,''-'','''');'
EXEC(#SQL)
SET #id = #id +1
END
I have 64 columns and I am trying to automate the loop process. The loop runs but it shows 0 affected rows. If I update the table, column by column, it works.
Any idea why its showing 0 affected rows and what can be done ?
update temp set col1 = 'C' where col1 IS Null; -- works (276 rows affected)--
declare #count as int;
declare #name as varchar(max);
set #count = 2;
while #count < (SELECT Count(*) FROM INFORMATION_SCHEMA.Columns where TABLE_NAME = 'temp')+1
Begin
Set #name = (select name from (select colorder, name from (SELECT *
FROM syscolumns WHERE id=OBJECT_ID('temp')) colnames) as cl where colorder = #count)
Print #name
update temp set #name = 'C' where #name IS Null;
SET #count = #count + 1;
END;
You need to use dynamic sql to update the different columns during runtime as below.
Note: I just added/modified the dynamic sql part.
declare #count as int;
declare #name as varchar(max)
declare #sql nvarchar (1000)
set #count = 2
while #count < (SELECT Count(*) FROM INFORMATION_SCHEMA.Columns where TABLE_NAME = 'temp')+1
Begin
Set #name = (select name from (select colorder, name from (SELECT *
FROM syscolumns WHERE id=OBJECT_ID('temp')) colnames) as cl where colorder = #count)
Print #name
set #sql = N'update temp set ' + #name + '= ''C'' where ' + #name + ' is null ';
exec sp_executesql #sql
SET #count = #count + 1
END;
I am using SQL Server 2012.
The first part of my query is already answered in this thread. But I also want a second column that will show the corresponding maximum value of that column in its corresponding table.
I have tried this approach: use a function that takes in table name and column name as parameter and return the max value. But it is illegal to use dynamic SQL from a function. Moreover, i cannot seem to call a function from within a SELECT query.
I have also tried using stored procedure, but i cannot figure out how to call it and use it. Please suggest alternative ways to achieve this.
I am new to SQL Server.
Thanks
I think the easiest solution would be stored procedure. As far as I know:
Dynamic SQL can't be placed in functions
Dynamic SQL can't be place in OPENROWSET
I addition, if you write such procedure:
Beware of names containing spaces, qoutes (SQL injection possible)
MAX(column) on non-Indexed columns would require full scan (can be very slow)
Table and column names can be duplicated (placed in differend schemas)
Id duplicates and performance is not a problem, take a look at the following snippet:
CREATE PROC FindMaxColumnValues
#type sysname = '%',
#table sysname = '%'
AS
DECLARE #result TABLE (TableName sysname, ColumnName sysname, MaxValue NVARCHAR(MAX))
DECLARE #tab sysname
DECLARE #col sysname
DECLARE cur CURSOR FOR
SELECT TABLE_NAME TableName, COLUMN_NAME [Column Name]
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE LIKE #type and TABLE_NAME LIKE #table
OPEN cur
FETCH NEXT FROM cur INTO #tab, #col
WHILE ##FETCH_STATUS = 0
BEGIN
DECLARE #sql nvarchar(MAX) = 'SELECT '+QUOTENAME(#tab,'''')+' [TableName], '+QUOTENAME(#col, '''')+' [ColumnName], MAX('+QUOTENAME(#col)+') FROM '+QUOTENAME(#tab)
INSERT INTO #result EXEC(#sql)
FETCH NEXT FROM cur INTO #tab, #col
END
CLOSE cur
DEALLOCATE cur
SELECT * FROM #result
Samples:
--MAX of INT's
EXEC FindMaxColumnValues 'INT'
--MAX of INT's in tables matching 'TestTab%'
EXEC FindMaxColumnValues 'INT', 'TestTab%'
--MAX of ALL columns
EXEC FindMaxColumnValues
Results:
TableName ColumnName MaxValue
IdNameTest ID 2
TestTable ID 5
TestTable Number 3
TableName ColumnName MaxValue
TestTable ID 5
TestTable Number 3
TableName ColumnName MaxValue
UpdateHistory UpdateTime 2016-07-14 12:21:37.00
IdNameTest ID 2
IdNameTest Name T2
TestTable ID 5
TestTable Name F
TestTable Number 3
You can use the below SP and enhance it per your Need,
CRETE PROCEDURE Getmaxtablecolval
AS
BEGIN
CREATE TABLE #t
(
tablename VARCHAR(50),
columnname VARCHAR(50),
id INT,
counts INT
)
INSERT INTO #t
SELECT table_name [Table Name],
column_name [Column Name],
NULL,
NULL
FROM information_schema.columns
WHERE data_type = 'INT'
BEGIN TRAN
DECLARE #id INT
SET #id = 0
UPDATE #t
SET #id = id = #id + 1
COMMIT TRAN
DECLARE #RowCount INT
SET #RowCount = (SELECT Count(0)
FROM #t)
DECLARE #I INT
SET #I = 1
DECLARE #Counter INT
DECLARE #TName VARCHAR(50)
DECLARE #CName VARCHAR(50)
DECLARE #DynamicSQL AS VARCHAR(500)
WHILE ( #I <= #RowCount )
BEGIN
SELECT #TName = tablename
FROM #t
WHERE id = #I
SELECT #CName = columnname
FROM #t
WHERE id = #I
SET #DynamicSQL = 'Update #T Set Counts = '
+ '(Select ISNull(Max(' + #CName + '), 0) From '
+ #TName + ') Where Id = '
+ CONVERT(VARCHAR(10), #I)
--PRINT #DynamicSQL
EXEC (#DynamicSQL)
SET #I = #I + 1
END
SELECT *
FROM #t
END
go
Getmaxtablecolval
You can create a procedure out of this:
CREATE PROCEDURE GET_COLUMNS_WITH_MAX_VALUE
#COLUMN_TYPE NVARCHAR(50)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- DUMMY VARIABLE TO COPY STRUCTURE TO TEMP
DECLARE #DUMMY TABLE
(
TABLE_NAME NVARCHAR(50),
COLUMN_NAME NVARCHAR(50),
MAX_VALUE NVARCHAR(MAX)
)
-- CREATE TEMP TABLE FOR DYNAMIC SQL
SELECT TOP 0 * INTO #TABLE FROM #DUMMY
INSERT INTO #TABLE
(TABLE_NAME, COLUMN_NAME)
SELECT TABLE_NAME, COLUMN_NAME
FROM information_schema.columns where data_type = #COLUMN_TYPE
DECLARE #TABLE_NAME VARCHAR(50) -- database name
DECLARE #COLUMN_NAME VARCHAR(256) -- path for backup files
DECLARE db_cursor CURSOR FOR
SELECT TABLE_NAME, COLUMN_NAME
FROM #TABLE
OPEN db_cursor
FETCH NEXT FROM db_cursor INTO #TABLE_NAME, #COLUMN_NAME
WHILE ##FETCH_STATUS = 0
BEGIN
DECLARE #SQL NVARCHAR(MAX) = 'UPDATE #TABLE SET MAX_VALUE = (SELECT MAX([' + #COLUMN_NAME + ']) FROM [' + #TABLE_NAME + ']) '
+ 'WHERE [COLUMN_NAME] = ''' + #COLUMN_NAME + ''' AND TABLE_NAME = ''' + #TABLE_NAME + '''';
PRINT #SQL
EXEC (#SQL)
FETCH NEXT FROM db_cursor INTO #TABLE_NAME, #COLUMN_NAME
END
CLOSE db_cursor
DEALLOCATE db_cursor
SELECT * FROM #TABLE
DROP TABLE #TABLE
END
GO
Usage:
EXEC GET_COLUMNS_WITH_MAX_VALUE 'INT'
Results:
TABLE1 ID 50
TABLE2 ID 100
TABLE3 CarID 20
TABLE4 StudentID 30
In T-SQL is it possible to give me a list of tables and columns that contain specified column value?
For example, I get a employee report generated out of a SQL Server database that shows "John" as name in one of the columns. Now I want to find out where "John" appears as a field value in any table/column in the database.
So in English my query should sound like;
Select table, column
from database A
where field = 'John'
Thanks
Try with this procedure
CREATE PROCEDURE FindMyData_String #DataToFind NVARCHAR(4000),
#ExactMatch BIT = 0
AS
SET NOCOUNT ON
DECLARE #Temp TABLE
(
RowId INT IDENTITY(1, 1),
SchemaName SYSNAME,
TableName SYSNAME,
ColumnName SYSNAME,
DataType VARCHAR(100),
DataFound BIT
)
INSERT INTO #Temp
(TableName,
SchemaName,
ColumnName,
DataType)
SELECT C.Table_Name,
C.TABLE_SCHEMA,
C.Column_Name,
C.Data_Type
FROM Information_Schema.Columns AS C
INNER JOIN Information_Schema.Tables AS T
ON C.Table_Name = T.Table_Name
AND C.TABLE_SCHEMA = T.TABLE_SCHEMA
WHERE Table_Type = 'Base Table'
AND Data_Type IN ( 'ntext', 'text', 'nvarchar', 'nchar',
'varchar', 'char' )
DECLARE #i INT
DECLARE #MAX INT
DECLARE #TableName SYSNAME
DECLARE #ColumnName SYSNAME
DECLARE #SchemaName SYSNAME
DECLARE #SQL NVARCHAR(4000)
DECLARE #PARAMETERS NVARCHAR(4000)
DECLARE #DataExists BIT
DECLARE #SQLTemplate NVARCHAR(4000)
SELECT #SQLTemplate = CASE
WHEN #ExactMatch = 1 THEN 'If Exists(Select *
From ReplaceTableName
Where Convert(nVarChar(4000), [ReplaceColumnName])
= '''
+ #DataToFind
+ '''
)
Set #DataExists = 1
Else
Set #DataExists = 0'
ELSE 'If Exists(Select *
From ReplaceTableName
Where Convert(nVarChar(4000), [ReplaceColumnName])
Like ''%'
+ #DataToFind
+ '%''
)
Set #DataExists = 1
Else
Set #DataExists = 0'
END,
#PARAMETERS = '#DataExists Bit OUTPUT',
#i = 1
SELECT #i = 1,
#MAX = MAX(RowId)
FROM #Temp
WHILE #i <= #MAX
BEGIN
SELECT #SQL = REPLACE(REPLACE(#SQLTemplate, 'ReplaceTableName', QUOTENAME(SchemaName) + '.'
+ QUOTENAME(TableName)), 'ReplaceColumnName', ColumnName)
FROM #Temp
WHERE RowId = #i
PRINT #SQL
EXEC SP_EXECUTESQL
#SQL,
#PARAMETERS,
#DataExists = #DataExists OUTPUT
IF #DataExists = 1
UPDATE #Temp
SET DataFound = 1
WHERE RowId = #i
SET #i = #i + 1
END
SELECT SchemaName,
TableName,
ColumnName
FROM #Temp
WHERE DataFound = 1
GO
EXEC FindMyData_String
#DataToFind='Enter Your String'
How do I find out how many tables are on my instance of SQL Server? I can get it for a single schema using select count(*) from sysobjects where type = 'U'
(from how to count number of tables/views/index in my database)
You're using the word "schema", but I think you're really asking to count tables across all "databases".
declare #t table (
DBName sysname,
NumTables int
)
insert into #t
exec sp_MSforeachdb N'select ''?'', count(*)
from [?].dbo.sysobjects
where type = ''U'''
select DBName, NumTables
from #t
where DBName not in ('distribution','master','model','msdb','tempdb')
order by DBName
select SUM(NumTables) as TotalTables
from #t
where DBName not in ('distribution','master','model','msdb','tempdb')
An option without using the hidden, undocumented sp_MSforeachdb
declare #sql nvarchar(max)
select #sql = coalesce(#sql + ' + ', '') + REPLACE('
(select count(*)
from ::DB::.sys.objects
where is_ms_shipped = 0
and type_desc = ''USER_TABLE'')', '::DB::', QUOTENAME(name))
from master.sys.databases
where owner_sid != 0x01
select #sql = 'select ' + #sql
exec (#sql) -- returns a single count of all [user] tables in the instance
>
A note on performance. It is insignificant in the greater scheme of things, but with all things interesting, someone is bound to time it. Here is a comparison of the ms_foreachdb approach passing through a temp table (it internally uses a cursor) against the string-concat method.
-- all the variables that we will use
declare #i int -- loop variable
declare #sql nvarchar(max) -- statement var used for 1st approach
declare #t table (DBName sysname, NumTables int) -- table used for 2nd approach
-- init plan cache and buffers
dbcc freeproccache dbcc dropcleanbuffers
print convert(varchar(30), getdate(), 121)
set #i = 0 while #i < 5 begin
set #sql = null
select #sql = coalesce(#sql, '') + REPLACE('
select #c = #c + count(*)
from ::DB::.sys.objects
where is_ms_shipped = 0
and type_desc = ''USER_TABLE''', '::DB::', QUOTENAME(name))
from master.sys.databases
where owner_sid != 0x01
select #sql = 'set nocount on declare #c int set #c = 0 ' + #sql + ' select #c'
exec (#sql)
-- clear plan cache and buffers after each run
dbcc freeproccache dbcc dropcleanbuffers set #i = #i + 1
end
print convert(varchar(30), getdate(), 121)
set #i = 0 while #i < 5 begin
insert into #t
exec sp_MSforeachdb N'select ''?'', count(*)
from [?].dbo.sysobjects
where type = ''U'''
select SUM(NumTables) as TotalTables
from #t
where DBName not in ('distribution','master','model','msdb','tempdb')
-- unfortunately this is required
delete from #t
-- clear plan cache and buffers after each run
dbcc freeproccache dbcc dropcleanbuffers set #i = #i + 1
end
print convert(varchar(30), getdate(), 121)
The result obtained for only 5 invocations (loop iterations) of each. YMMV
start : 2011-01-21 14:21:45.180
end of string-concat : 2011-01-21 14:21:57.497 (12.317)
end of sp_msforeachdb : 2011-01-21 14:22:13.937 (16.440)
It has to be noted that the temp table has to be emptied between each iteration of the 2nd approach, so that could contribute to the total time. It should have been insignificant though
Here is an answer that does not use undocumented functions and works in SQL Server 2005, 2008 and 2008R2. This answer can be used with minor modifications to run any statement across databases.
DECLARE #sql varchar(200), #dbname sysname, #dbid smallint;
CREATE table #alltables
(dbname sysname,
[number of tables] int);
SELECT top 1 #dbname = name, #dbid = database_id
FROM sys.databases
where database_id > 4;
WHILE (#dbname is not null)
begin
-- the statement below could contain any valid select statement
set #sql = 'use ' + #dbname + '; insert into #alltables select ''' + #dbname + ''', count(*) from sys.tables';
EXEC (#sql)
set #dbname = null;
SELECT top 1 #dbname = name, #dbid = database_id
FROM sys.databases
where database_id > #dbid;
end;
select * FROM #alltables;
SELECT sum([number of tables]) "Total Number of Tables in all user databases" from #alltables;
drop table #alltables;
Select Count(*)
From INFORMATION_SCHEMA.TABLES
Where TABLE_TYPE = 'BASE TABLE'
If what you are seeking is a way to determine how many tables exist across all databases on a given SQL Server instance, then you need to cycle through each database. One way would be:
Declare #Databases Cursor
Declare #DbName as nvarchar(64)
Declare #SQL nvarchar(max)
Declare #BaseSQL nvarchar(max)
Declare #Count int
Declare #TotalCount int
Set #Databases = Cursor Fast_Forward For
select [name]
from master..sysdatabases
where [name] Not In('master','model','msdb','tempdb')
Open #Databases
Fetch Next From #Databases Into #DbName
Set #BaseSQL = 'Select #Count = Count(*)
From DatabaseName.INFORMATION_SCHEMA.TABLES
Where TABLE_TYPE = ''BASE TABLE'''
Set #TotalCount = 0
While ##Fetch_Status = 0
Begin
Set #Count = 0
Set #SQL = Replace(#BaseSQL, 'DatabaseName', QuoteName(#DbName))
exec sp_executesql #SQL, N'#Count int OUTPUT', #Count OUTPUT
Set #TotalCount = #TotalCount + #Count
Fetch Next From #Databases Into #DbName
End
Close #Databases
Deallocate #Databases
Select #TotalCount
This solution has the advantage of not using any undocumented features such as sp_MSforeachdb however it is obviously more verbose.