Find computed columns in SQL Server views - sql-server

A colleague asked me to help them to identify the views in a database that have one or more computed columns. The database has hundreds of views so they're trying to find an automated way to accomplish this task. I am not seeing the results in the database that I was expecting. Here is an example:
--DROP TABLE dbo.Products
CREATE TABLE dbo.Products
(
ProductID int IDENTITY (1,1) NOT NULL
, QtyAvailable smallint
, UnitPrice money
);
GO
--DROP VIEW dbo.uvw_Products
-- CREATE VIEW must be the first statement in its batch, hence the GO separators
CREATE VIEW dbo.uvw_Products
AS
SELECT ProductID
, QtyAvailable
, UnitPrice
, (QtyAvailable * UnitPrice) AS InventoryValue
FROM dbo.Products;
GO
-- Look at the view and find the computed column
SELECT OBJECT_SCHEMA_NAME(T.[object_id],DB_ID()) AS [Schema],
T.[name] AS [table_name], AC.[name] AS [column_name],
TY.[name] AS system_data_type, AC.[max_length],
AC.[precision], AC.[scale], AC.[is_nullable], AC.[is_ansi_padded], AC.[is_computed]
FROM sys.[views] AS T
INNER JOIN sys.[all_columns] AC ON T.[object_id] = AC.[object_id]
INNER JOIN sys.[types] TY ON AC.[system_type_id] = TY.[system_type_id] AND AC.[user_type_id] = TY.[user_type_id]
WHERE T.[is_ms_shipped] = 0
AND T.[name] = 'uvw_Products'
ORDER BY T.[name], AC.[column_id]
-- Pulls up no results - no entries in sys.computed_columns
SELECT TOP 10 *
FROM sys.computed_columns C
INNER JOIN sys.views V ON C.[object_id] = V.[object_id]
WHERE V.[name] = 'uvw_Products'
As you can see from this simple example, SQL Server does not seem to be storing the value in the is_computed column.
What am I missing? How can we find the computed columns in views?

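This is not ideal, and I don't think a foolproof solution exists for this problem. The is_computed flag only marks computed columns declared in a table definition; an expression in a view's SELECT list is not recorded as one, which is why sys.computed_columns returns nothing for the view. Depending on your naming conventions, though, you could join the view columns to the table columns by name to see which view columns do not exist in any table.
For example: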
SELECT v.*
FROM
(
SELECT [Schema] = OBJECT_SCHEMA_NAME(t.[object_id], DB_ID())
, [table_name] = t.[name]
, [column_name] = vc.[name]
FROM sys.[views] t
JOIN sys.[all_columns] vc ON t.[object_id] = vc.[object_id]
) v
LEFT JOIN
(
SELECT [Schema] = OBJECT_SCHEMA_NAME(t.[object_id], DB_ID())
, [table_name] = t.[name]
, [column_name] = vc.[name]
FROM sys.[tables] t
JOIN sys.[all_columns] vc ON t.[object_id] = vc.[object_id]
) t
ON t.[column_name] = v.[column_name]
WHERE t.[table_name] IS NULL
Returns:
Schema | table_name | column_name
----------------------------------------
dbo | uvw_Products | InventoryValue
If your view names contain the source table name, as in 'uvw_Products', you could also use that in the join so that identically named columns in unrelated tables do not mask a computed column.
Again, it's not ideal, but it is a relatively simple way to narrow the search.
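The naming-convention idea above could be sketched roughly as follows. This is only an illustration: it assumes every view is named 'uvw_' followed by its base table name and lives in the same schema as that table, which is a hypothetical convention and not something the catalog guarantees. Columns that the view merely renames will be reported too.

```sql
-- Sketch: list view columns that have no same-named column in the
-- base table implied by the (assumed) 'uvw_<TableName>' convention.
SELECT  [Schema]      = OBJECT_SCHEMA_NAME(v.[object_id]),
        [view_name]   = v.[name],
        [column_name] = vc.[name]
FROM sys.views AS v
JOIN sys.columns AS vc
    ON vc.[object_id] = v.[object_id]
WHERE v.is_ms_shipped = 0
  AND NOT EXISTS
  (
      SELECT 1
      FROM sys.tables AS t
      JOIN sys.columns AS tc
          ON tc.[object_id] = t.[object_id]
      WHERE t.[schema_id] = v.[schema_id]
        AND t.[name] = STUFF(v.[name], 1, 4, '')  -- strip the assumed 'uvw_' prefix
        AND tc.[name] = vc.[name]
  )
ORDER BY [Schema], [view_name], vc.column_id;
```

For the example view, this should return only InventoryValue, since ProductID, QtyAvailable and UnitPrice all exist in dbo.Products.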

Related

Join INFORMATION_SCHEMA.TABLES with another table with more TABLE_NAMES and result would be different if table name is in the first table or not

We have a table with a list of the names of tables we want to create. It has no ID column or anything; it's just a few rows of data with 2 columns. We want to join that table with INFORMATION_SCHEMA.TABLES to check which of the tables we have already created and which we have not, so we wrote the query below as a temporary solution:
with cte1 as (
select d.TABNAME, d.CLASS from dbo.table_list as d
left join INFORMATION_SCHEMA.TABLES as t on t.TABLE_NAME = d.TABNAME
where d.CLASS in ('INIT','STERN') and t.TABLE_SCHEMA = 'dbo'),
cte2 as (select d.TABNAME, d.CLASS
from dbo.table_list as d
where d.CLASS in ('INIT','STERN') and d.TABNAME not in (select TABNAME from cte1))
select *, 'Active' as [Status] from cte1 union all
select * , 'Inactive' from cte2
This is what table_list looks like:
TABNAME | CLASS
----------------
TABLE1 | INIT
TABLE2 | STERN
TABLE3 | STERN
TABLE4 | STERN
TABLE5 | INIT
We already have TABLE1 and TABLE2 created so the result of the query looks like this:
TABNAME | CLASS | STATUS
---------------------------
TABLE1 | INIT | Active
TABLE2 | STERN | Active
TABLE3 | STERN | Inactive
TABLE4 | STERN | Inactive
TABLE5 | INIT | Inactive
It works well enough like this but we were wondering if we could make it shorter.

This can be way shorter, yes. You could just reference the table dbo.table_list and see if you get a valid OBJECT_ID:
SELECT tl.TABNAME,
tl.CLASS,
CASE WHEN OBJECT_ID(N'dbo.' + QUOTENAME(tl.TABNAME)) IS NULL THEN 'Inactive' ELSE 'Active' END AS Status
FROM dbo.table_list tl --"d" for "table_list" doesn't make a lot of sense.
WHERE tl.CLASS IN ('INIT','STERN');
If you wanted to use the catalog views, you could use OUTER APPLY to join to the table while supplying a value for both the schema and table name, or just JOIN to sys.schemas based on a literal and then LEFT JOIN to sys.tables:
SELECT tl.TABNAME,
tl.CLASS,
CASE WHEN st.[name] IS NULL THEN 'Inactive' ELSE 'Active' END AS Status
FROM dbo.table_list tl --"d" for "table_list" doesn't make a lot of sense.
-- OUTER APPLY keeps rows with no matching table; CROSS APPLY would drop them,
-- so 'Inactive' could never be returned.
OUTER APPLY (SELECT t.[name]
FROM sys.schemas s
JOIN sys.tables t ON s.schema_id = t.schema_id
WHERE s.[name] = N'dbo'
AND t.[name] = tl.TABNAME) st
WHERE tl.CLASS IN ('INIT','STERN');
SELECT tl.TABNAME,
tl.CLASS,
CASE WHEN t.[name] IS NULL THEN 'Inactive' ELSE 'Active' END AS Status
FROM dbo.table_list tl --"d" for "table_list" doesn't make a lot of sense.
JOIN sys.schemas s ON s.[name] = N'dbo'
LEFT JOIN sys.tables t ON s.schema_id = t.schema_id
AND tl.TABNAME = t.[name]
WHERE tl.CLASS IN ('INIT','STERN');

Does a SELECT COUNT(*) query have to do a full table scan?

Does a query that gets the count of all rows in a table have to do a full table scan or does SQL Server maintain a count of rows somewhere?
SELECT COUNT(*) FROM TABLE_NAME;
The table TABLE_NAME has a primary key, and therefore a clustered index, and looks like so:
CREATE TABLE TABLE_NAME
(
Id int PRIMARY KEY IDENTITY(1, 1),
Name nvarchar(50) NOT NULL
);
I am using Microsoft SQL Server 2014.

The server will always read every row to count them (if there is an index, it will scan the entire index). You can't escape this as long as you are doing SELECT COUNT(*) FROM Table.
Alternatively, you can read the row count that SQL Server keeps in its metadata, without touching the rows themselves. The query below lists every heap (indid 0) and clustered index (indid 1); note that sys.sysindexes is a backward-compatibility view:
SELECT OBJECT_NAME(i.id) [Table_Name], i.rowcnt [Row_Count]
FROM sys.sysindexes i WITH (NOLOCK)
WHERE i.indid in (0,1)
ORDER BY i.rowcnt desc
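The same metadata lookup can be sketched against sys.dm_db_partition_stats instead of the compatibility view. This is a minimal sketch, assuming the table from the question lives in the dbo schema and the caller has VIEW DATABASE STATE permission; the value comes from metadata, so it should be treated as approximate.

```sql
-- Sketch: metadata-based row count for a single table.
-- index_id 0 = heap, 1 = clustered index; SUM covers partitioned tables.
SELECT SUM(ps.row_count) AS approx_row_count
FROM sys.dm_db_partition_stats AS ps
WHERE ps.[object_id] = OBJECT_ID(N'dbo.TABLE_NAME')  -- schema name is an assumption
  AND ps.index_id IN (0, 1);
```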
If you are looking for an approximate count of the records, along with the space each table uses, you can also use the following query:
SELECT
TableName = t.NAME,
SchemaName = s.Name,
[RowCount] = p.rows,
TotalSpaceMB = CONVERT(DECIMAL(18,2), SUM(a.total_pages) * 8 / 1024.0),
UsedSpaceMB = CONVERT(DECIMAL(18,2), SUM(a.used_pages) * 8 / 1024.0),
UnusedSpaceMB = CONVERT(DECIMAL(18,2), (SUM(a.total_pages) - SUM(a.used_pages)) * 8 / 1024.0)
FROM
sys.tables t
INNER JOIN sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN sys.allocation_units a ON p.partition_id = a.container_id
LEFT OUTER JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE
t.NAME NOT LIKE 'dt%'
AND t.is_ms_shipped = 0
AND i.OBJECT_ID > 255
GROUP BY
t.Name,
s.Name,
p.Rows
ORDER BY
TotalSpaceMB DESC
This will show non-system tables with their calculated (not exact) row count and the sum of the sizes of their data (with any index they might have), relatively fast without retrieving the records.
When SQL Server performs a query like SELECT COUNT(*), SQL Server will use the narrowest non-clustered index to count the rows. If the table does not have any non-clustered index, it will have to scan the table.
If your table has a clustered index you can get your count even faster.
SELECT COUNT(*) FROM TABLE_NAME;
Does a full table scan.
For optimizations you can refer to this.
You can also write it the following way, although SQL Server optimizes COUNT(1) and COUNT(*) identically, so no performance difference should be expected:
SELECT COUNT(1) FROM TABLE_NAME

Updating MS SQL objects with dependencies

I have SQL scripts (DROP ... CREATE ...) for views, procedures, triggers, and functions, stored in files on disk. I need to run these scripts to update the database structure.
The order in which the scripts run is important: if view1 depends on view2 and I run the script for view1 first, an error may occur.
Let's suppose there are no external dependencies in my scripts, so the database knows about all of them.
Is there a way to select all these object names from SQL Server in dependency order, so that I can run the scripts in that order without worrying about the errors above?
I wrote this sql:
SET NOCOUNT ON
-- Table variables must be named with @; "declare #name TABLE" is not valid T-SQL
declare @deps TABLE (name nvarchar(512), dep_name nvarchar(512))
declare @ordered TABLE (name nvarchar(512), [level] INT)
insert @deps(name, dep_name)
SELECT DISTINCT CAST(OBJ.name as nvarchar(512)) AS ObjectName,
CAST(REFOBJ.name as nvarchar(512)) AS ReferencedObjectName
FROM sys.sql_dependencies AS DEP
INNER JOIN
sys.objects AS OBJ
ON DEP.object_id = OBJ.object_id
INNER JOIN
sys.schemas AS SCH
ON OBJ.schema_id = SCH.schema_id
INNER JOIN sys.objects AS REFOBJ
ON DEP.referenced_major_id = REFOBJ.object_id
INNER JOIN sys.schemas AS REFSCH
ON REFOBJ.schema_id = REFSCH.schema_id
LEFT JOIN sys.columns AS REFCOL
ON DEP.class IN (0, 1)
AND DEP.referenced_minor_id = REFCOL.column_id
AND DEP.referenced_major_id = REFCOL.object_id
WHERE OBJ.type_desc IN ('VIEW','SQL_STORED_PROCEDURE','SQL_INLINE_TABLE_VALUED_FUNCTION','SQL_TRIGGER','SQL_SCALAR_FUNCTION')
AND REFOBJ.type_desc IN ('VIEW','SQL_STORED_PROCEDURE','SQL_INLINE_TABLE_VALUED_FUNCTION','SQL_TRIGGER','SQL_SCALAR_FUNCTION')
-- Level 1: objects that are referenced but reference nothing themselves
insert @ordered(name, [level])
select distinct d1.dep_name, 1 as [level]
from @deps d1
LEFT JOIN @deps d2 ON d1.dep_name = d2.name
where d2.name IS NULL
WHILE EXISTS(select * from @deps)
BEGIN
-- Remove dependency rows for objects that have already been ordered
delete d
FROM @deps d
JOIN @ordered o ON d.name = o.name
-- Next level: objects whose dependencies have all been ordered
insert @ordered(name, [level])
SELECT DISTINCT d0.name, (select MAX([level]) + 1 FROM @ordered)
FROM @deps d0
LEFT JOIN (
SELECT d.name
from @deps d
LEFT JOIN @ordered o ON d.dep_name = o.name
WHERE o.name IS NULL
) dfilter ON d0.name = dfilter.name
WHERE dfilter.name is NULL
END
select * from @ordered
ORDER by [level] asc
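As an alternative to the loop, the levels could be computed with a recursive CTE over sys.sql_expression_dependencies. This is only a sketch under stated assumptions: it assumes every dependency is resolved within the current database (rows with a NULL referenced_id are skipped), and it will fail with a recursion-limit error if objects reference each other in a cycle. The CTE names deps and walk and the column run_level are made up for illustration.

```sql
-- Sketch: order objects so that each one comes after everything it references.
WITH deps AS
(
    -- One row per (object, object it references); unresolved and self references are skipped.
    SELECT DISTINCT d.referencing_id AS [object_id], d.referenced_id AS depends_on_id
    FROM sys.sql_expression_dependencies AS d
    WHERE d.referenced_id IS NOT NULL
      AND d.referencing_id <> d.referenced_id
),
walk AS
(
    -- Anchor: every user object starts at level 0.
    SELECT o.[object_id], 0 AS lvl
    FROM sys.objects AS o
    WHERE o.is_ms_shipped = 0
    UNION ALL
    -- Step: anything that references an object sits at least one level above it.
    SELECT d.[object_id], w.lvl + 1
    FROM walk AS w
    JOIN deps AS d ON d.depends_on_id = w.[object_id]
)
SELECT  OBJECT_SCHEMA_NAME(w.[object_id]) AS [schema_name],
        OBJECT_NAME(w.[object_id])        AS [object_name],
        MAX(w.lvl)                        AS run_level   -- longest dependency chain below the object
FROM walk AS w
JOIN sys.objects AS o ON o.[object_id] = w.[object_id]
WHERE o.[type] IN ('V', 'P', 'FN', 'IF', 'TF', 'TR')
GROUP BY w.[object_id]
ORDER BY run_level, [schema_name], [object_name]
OPTION (MAXRECURSION 100);
```

Running the scripts in ascending run_level order should create referenced objects before the objects that use them.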

Query to return distinct values in fields containing string across multiple tables

The following query looks for fields within a db that contain '%string%' and returns them in a table with 5 columns; Schema, Table, Number of fields containing desired string in title of field within table, number of rows in table, and finally the field names.
SELECT s.name schemaName
, t.name tabName
, COUNT(c.name) OVER (PARTITION BY t.name ORDER BY t.name) totalColsWithString
, rc.row_count
, c.name colName
FROM sys.all_columns c
JOIN sys.tables t ON (t.object_id = c.object_id)
JOIN sys.schemas s ON (s.schema_id = t.schema_id)
LEFT JOIN
(
SELECT o.name
, ddps.row_count
FROM sys.indexes i
JOIN sys.objects o ON (i.object_id = o.object_id)
JOIN sys.dm_db_partition_stats AS ddps ON (i.object_id = ddps.object_id AND i.index_id = ddps.index_id)
WHERE i.index_id < 2 AND o.is_ms_shipped = 0
) rc ON (rc.name = t.name)
WHERE c.name LIKE '%String%'
AND row_count <> 0;
What I now want is a field that shows the number of distinct values in those fields which contain 'string' in the title (in all the columns returned in above query).
Does MS SQL Server store any info about distinct values in fields? Can it be made to?

Updated answer
You should actually consider running 2 queries in that case, or using subqueries, which may cause performance issues on a big dataset. A subquery could look like this:
SELECT [...],
(SELECT COUNT(*) FROM sys.all_columns c2 WHERE c2.name = c.name) AS totalcolswithstring
FROM [...]
I set up a fiddle for you SqlFiddle

TSQL to Eliminate Repetitive Query

I have pretty basic table schema.
Table A
TEMPLATE_ID TEMPLATE_NAME
Table A has the following rows
1 Procs
2 Letter
3 Retire
4 Anniversary
5 Greet
6 Event
7 Meeting
8... etc.
Table B
TEMPLATE_ID VALUE
Table B has 100K+ rows with TEMPLATE_ID connecting the two tables.
Now the execs want a sample of 20 records of types 1-5 from table A. I could do something basic...which is about my speed when it comes to TSQL.
SELECT TOP(20) B.VALUE FROM TableB AS B
JOIN TableA AS A ON
B.TEMPLATE_ID = A.TEMPLATE_ID
AND A.TEMPLATE_NAME IN ('Procs', 'Letter'...)
But that isn't quite right, as I end up with 20 rows in total. I was expecting 100 rows: 20 for each template.
Is this one of those areas where partitioning could be used? I can see how I would break TableB into partitions for each template (TableA), but I'm not sure how I would limit each one to 20 rows.
OK, so I could just cut and paste 20 rows from each partition into Excel, or write 5 very basic queries, but this is kind of an academic pursuit to improve my knowledge.
So to clarify: 20 records from each of the first 5 template types.
TIA

You can use ROW_NUMBER, partition the data by TEMPLATE_NAME, and return only 20 rows from each partition:
SELECT * FROM
(
SELECT B.VALUE,
ROW_NUMBER() OVER ( PARTITION BY A.TEMPLATE_NAME ORDER BY ( select NULL)) as seq
FROM
TableB AS B
JOIN TableA AS A ON
B.TEMPLATE_ID = A.TEMPLATE_ID
) T
where T.seq <=20
order by T.VALUE
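To see the ROW_NUMBER approach end to end, here is a small self-contained sketch. The table variables, the sample rows, and the @PerTemplate variable are all made up for illustration and stand in for TableA and TableB; the limit is 2 rather than 20 so the effect is visible on a handful of rows.

```sql
-- Hypothetical sample data; table variables stand in for TableA and TableB.
DECLARE @TableA TABLE (TEMPLATE_ID int PRIMARY KEY, TEMPLATE_NAME varchar(50) NOT NULL);
DECLARE @TableB TABLE (TEMPLATE_ID int NOT NULL, VALUE varchar(50) NOT NULL);

INSERT @TableA (TEMPLATE_ID, TEMPLATE_NAME)
VALUES (1, 'Procs'), (2, 'Letter'), (3, 'Retire');

INSERT @TableB (TEMPLATE_ID, VALUE)
VALUES (1, 'p1'), (1, 'p2'), (1, 'p3'),
       (2, 'l1'), (2, 'l2'),
       (3, 'r1');

DECLARE @PerTemplate int = 2;  -- would be 20 for the real request

SELECT T.TEMPLATE_NAME, T.VALUE
FROM
(
    SELECT A.TEMPLATE_NAME,
           B.VALUE,
           -- Number the rows within each template, restarting at 1 per template
           ROW_NUMBER() OVER (PARTITION BY A.TEMPLATE_ID ORDER BY B.VALUE) AS seq
    FROM @TableB AS B
    JOIN @TableA AS A
        ON A.TEMPLATE_ID = B.TEMPLATE_ID
    WHERE A.TEMPLATE_NAME IN ('Procs', 'Letter')
) AS T
WHERE T.seq <= @PerTemplate
ORDER BY T.TEMPLATE_NAME, T.seq;
```

With this sample data the query returns two rows for Letter (l1, l2) and two for Procs (p1, p2); the third Procs row and the Retire template are filtered out.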
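Could you try this? Note that DENSE_RANK gives tied values the same rank, so a template can return more than 20 rows if VALUE contains duplicates.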
SELECT B.VALUE
FROM
(
SELECT TEMPLATE_ID,VALUE, DENSE_RANK ( ) OVER (PARTITION BY TEMPLATE_ID ORDER BY VALUE DESC) AS RANK_NO
FROM TABLE_B
) B INNER JOIN TABLE_A A ON (A.TEMPLATE_ID = B.TEMPLATE_ID)
WHERE A.TEMPLATE_NAME IN ('Procs', 'Letter'...)
AND B.RANK_NO <= 20
;
You use a ranking function. You first partition your data, order each partition and apply the ranking function:
select seq = row_number() over (
partition by table_catalog , table_schema , table_name
order by column_name
) ,
*
from information_schema.COLUMNS
The above code partitions the rows in information_schema.COLUMNS by the fully qualified table/view name to which they belong. Each partition is then ordered alphabetically by column name, and each row is given a row_number().
That then gets wrapped in another select which makes use of it. This code pulls the first 3 columns (by column name) for each table in the system and provides some information about each:
select t.table_catalog ,
t.table_schema ,
t.table_name ,
t.table_type ,
c.seq ,
c.ordinal_position ,
c.COLUMN_NAME ,
data_type = c.data_type + coalesce('('+convert(varchar,c.character_maximum_length)+')','')
+ case c.is_nullable when 'yes' then ' is null' else ' is not null' end
from information_schema.tables t
join ( select seq = row_number() over (
partition by table_catalog , table_schema , table_name
order by column_name
) ,
*
from information_schema.COLUMNS
) c on c.table_catalog = t.table_catalog
and c.table_schema = t.table_schema
and c.table_name = t.table_name
where c.seq <= 3
order by t.table_catalog ,
t.table_schema ,
t.table_name ,
c.seq
SELECT * FROM
( SELECT B.VALUE, A.TEMPLATE_NAME,
-- ORDER BY NEWID() numbers the rows in random order, giving a random sample per template
ROW_NUMBER() OVER ( PARTITION BY A.TEMPLATE_ID ORDER BY NEWID() ) as rn
FROM TableB AS B
JOIN TableA AS A
ON A.TEMPLATE_ID = B.TEMPLATE_ID
AND A.TEMPLATE_ID <= 5
) T
where T.rn <= 20
order by T.VALUE
