SQL Server ambiguous query validation - sql-server

I have just come across a curious SQL Server behaviour.
In my scenario I have a sort of dynamic database, so I need to check the existence of tables and columns before running queries involving them.
I can't explain why the query
IF 0 = 1 -- Check if NotExistingTable exists in my database
BEGIN
SELECT NotExistingColumn FROM NotExistingTable
END
GO
executes correctly, but the query
IF 0 = 1 -- Check if NotExistingColumn exists in my ExistingTable
BEGIN
SELECT NotExistingColumn FROM ExistingTable
END
GO
returns Invalid column name 'NotExistingColumn'.
In both cases the IF block is not executed and contains an invalid query (the first misses a table, the second a column).
Is there any reason why the SQL engine checks for syntax errors in only one case?
Thanks in advance

Deferred name resolution:
Deferred name resolution can only be used when you reference nonexistent table objects. All other objects must exist at the time the stored procedure is created. For example, when you reference an existing table in a stored procedure you cannot list nonexistent columns for that table.
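One way to work around this, as a minimal sketch: test for the column with COL_LENGTH (which returns NULL when the table or the column does not exist) and run the statement through dynamic SQL, so that it is only compiled if the check passes. ExistingTable and NotExistingColumn are the placeholder names from the question; the dbo schema is an assumption.

```sql
-- Assumed names: dbo.ExistingTable and NotExistingColumn stand in for your own objects.
IF COL_LENGTH(N'dbo.ExistingTable', N'NotExistingColumn') IS NOT NULL
BEGIN
    -- The string is parsed and bound only when sp_executesql runs,
    -- so the outer batch compiles even when the column is missing.
    EXEC sys.sp_executesql N'SELECT NotExistingColumn FROM dbo.ExistingTable;';
END;
GO
```

The outer batch now contains no direct reference to the column, so there is nothing for the compiler to reject before the IF is evaluated.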

You can look through the system tables for the existence of a specific table / column name
SELECT t.name AS table_name,
SCHEMA_NAME(schema_id) AS schema_name,
c.name AS column_name
FROM sys.tables AS t
INNER JOIN sys.columns c ON t.OBJECT_ID = c.OBJECT_ID
WHERE c.name LIKE '%colname%'
AND t.name LIKE '%tablename%'
ORDER BY schema_name, table_name;
The query above returns all tables and columns whose names partially match the given column name and table name; for an exact match, replace the LIKE '%...%' patterns with equality comparisons.
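As a rough sketch of the exact-match variant, the same join can be wrapped in IF EXISTS. The names dbo, tablename and colname are placeholders, not objects from the question.

```sql
-- Placeholder names: replace dbo, tablename and colname with your own.
IF EXISTS (
    SELECT 1
    FROM sys.tables AS t
    INNER JOIN sys.columns AS c ON c.object_id = t.object_id
    WHERE SCHEMA_NAME(t.schema_id) = N'dbo'
      AND t.name = N'tablename'
      AND c.name = N'colname'
)
    PRINT 'Column exists';
ELSE
    PRINT 'Column or table is missing';
```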

Related

How to check a database in SQL to make sure that all of its tables are in use?

I was assigned to see if all of the current tables in a database are used and if not to drop them. These are the steps I have taken so far:
Searched tables names in the program that uses that database to see if a query has been made in the program based on those tables names.
Investigated if a table primary key has been used in any other places such as view or table (Connectivity with other used tables). I used:
SELECT
t.name AS table_name,
SCHEMA_NAME(schema_id) AS schema_name,
c.name AS column_name
FROM
sys.tables AS t
INNER JOIN
sys.columns c ON t.OBJECT_ID = c.OBJECT_ID
WHERE
c.name LIKE 'DeflectionId' -- write the column you search here
ORDER BY
schema_name, table_name;
Searched inside all of the stored procedure texts to see if a table name has been used inside them:
SELECT DISTINCT
o.name AS Object_Name,
o.type_desc
FROM
sys.sql_modules m
INNER JOIN
sys.objects o ON m.object_id = o.object_id
WHERE
m.definition LIKE '%\[Test_Results_LU\]%' ESCAPE '\';
or
SELECT name
FROM sys.procedures
WHERE Object_definition(object_id) LIKE '%Test_Results_LU%'
(from this link: Search text in stored procedure in SQL Server )
Used Object Explorer view to see if a table with the similar/same name and size exists in the database.
Do you think there are other ways that I can use to investigate it better?
Are these steps efficient at all? How would you do it?
Those are all reasonable things to check. One more thing to do would be to turn on profiling or auditing, depending on your SQL server version, and actually monitor for the tables being used for a reasonable time period. You may not be able to do that with a production system, and it's still not 100% guaranteed - what if there's an important table that's only queried once a year?
https://dba.stackexchange.com/questions/40960/logging-queries-and-other-t-sql
https://learn.microsoft.com/en-us/sql/relational-databases/security/auditing/view-a-sql-server-audit-log?view=sql-server-2017
One other suggestion before dropping the tables is to explicitly remove access to them (either with DENY/REVOKE or rename them to table-name_purge) for a week or two and see if anyone complains. If they don't, then it's probably safe to make a backup and then drop them.
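A sketch of both options, assuming the table from the question lives in the dbo schema. Members of sysadmin are not subject to permission checks, so the DENY only affects ordinary users.

```sql
-- Option 1: block access for ordinary users.
DENY SELECT, INSERT, UPDATE, DELETE ON OBJECT::dbo.Test_Results_LU TO public;

-- To lift the DENY later (explicit grants that public held before may need re-granting):
-- REVOKE SELECT, INSERT, UPDATE, DELETE ON OBJECT::dbo.Test_Results_LU FROM public;

-- Option 2: rename the table so that any remaining caller fails loudly.
EXEC sp_rename N'dbo.Test_Results_LU', N'Test_Results_LU_purge';

-- To undo the rename:
-- EXEC sp_rename N'dbo.Test_Results_LU_purge', N'Test_Results_LU';
```

Pick one option rather than running both; the rename also breaks views and procedures that reference the old name, which is the point of the test.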
A couple of other places to check. Both of these rely on data that is cached automatically by the system, is not persisted between restarts, and can be dropped at any time, so absence from these results does not prove that the table is not used, but you may find evidence that a table definitely is in use.
SELECT [Schema] = OBJECT_SCHEMA_NAME(object_id),
[ObjectName] = OBJECT_NAME(object_id),
*
FROM sys.dm_db_index_usage_stats
WHERE database_id = DB_ID()
And in the plan cache
USE YourDB;
DROP TABLE IF EXISTS #cached_plans, #plans, #results;
DECLARE @dbname nvarchar(300) = QUOTENAME(DB_NAME());
SELECT dm_exec_query_stats.creation_time,
dm_exec_query_stats.last_execution_time,
dm_exec_query_stats.execution_count,
dm_exec_query_stats.sql_handle,
dm_exec_query_stats.plan_handle
INTO #cached_plans
FROM sys.dm_exec_query_stats;
WITH distinctph
AS (SELECT DISTINCT plan_handle
FROM #cached_plans)
SELECT query_plan,
plan_handle
INTO #plans
FROM distinctph
CROSS APPLY sys.dm_exec_query_plan(plan_handle);
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT cp.*,
st.text,
[Database] = n.value('@Database', 'nvarchar(300)'),
[Schema] = n.value('@Schema', 'nvarchar(300)'),
[Table] = n.value('@Table', 'nvarchar(300)')
INTO #results
FROM #cached_plans cp
JOIN #plans p
ON cp.plan_handle = p.plan_handle
CROSS APPLY sys.dm_exec_sql_text(sql_handle) st
CROSS APPLY query_plan.nodes('//Object[@Database = sql:variable("@dbname") and @Schema != "[sys]"]') qn(n);
SELECT *
FROM #results;

SQL Server: what do tables named #ABCDEF01 contain?

I run the query
select *
from tempdb.sys.objects
where type_desc = 'USER_TABLE'
and see tables named like #AB12CD34, #ABCDEF01, etc.
I don't use such a naming convention for temp tables. Is it possible to determine the real names of these tables?
Any table that starts with '#' is a temporary table that exists until the session or connection is lost. The table is visible only within the current session. Any table that starts with '##' is a similar type table except that it is global in nature and other sessions / connections can see it.
This is not the system naming convention for standard temporary tables.
Temporary tables will generally show up as a 128 character name in the format
#YourTempTableName______________ ... _________00000000000D
Where the hex at the end acts to prevent collisions between different sessions.
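To see the padded name for yourself, here is a small sketch. #YourTempTableName is just the example name used above, and the id column is an arbitrary placeholder.

```sql
-- Create an ordinary temp table in the current session.
CREATE TABLE #YourTempTableName (id int);

-- Look up its internal name in tempdb. [_] escapes the underscore,
-- which would otherwise act as a single-character wildcard in LIKE.
SELECT name,
       LEN(name) AS name_length
FROM tempdb.sys.objects
WHERE type_desc = 'USER_TABLE'
  AND name LIKE N'#YourTempTableName[_]%';

DROP TABLE #YourTempTableName;
```

If other sessions have created a temp table with the same name, they show up here too, each with a different suffix.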
Tables named #AB12CD34 are either table variables/table valued parameters or they are cached temporary tables from stored procedures.
When the stored procedure finishes executing a temp table can be cached so it does not have to be re-created again on next use. The FCheckAndCleanupCachedTempTable transaction renames the temp tables to this format as part of this process.
More about temporary table caching in this blog post.
The cached temp tables belong to the execution context of a cached execution plan. You can see stored procedures with cached execution contexts with
SELECT DB_NAME(dbid) AS DatabaseName,
OBJECT_NAME(objectid, dbid) AS ObjectName
FROM sys.dm_exec_cached_plans cp
CROSS apply sys.dm_exec_sql_text(cp.plan_handle) t
JOIN sys.dm_os_memory_objects m1
ON m1.memory_object_address = cp.memory_object_address
JOIN sys.dm_os_memory_objects m2
ON m1.page_allocator_address = m2.page_allocator_address
WHERE m2.type = 'MEMOBJ_EXECUTE'
AND cp.objtype = 'Proc'
You can also see cached temp tables with
select *
from sys.dm_os_memory_cache_entries
where name='tempdb' AND entry_data LIKE '<entry database_id=''2'' entity_type=''object'' entity_id=''-%'
But I don't see any way of linking these together to see which plan caches what temp object.
You could look at the column names and see if you recognize the table structure from one of your procs.
WITH T
AS (SELECT *
FROM tempdb.sys.objects
WHERE type_desc = 'USER_TABLE'
AND name = '#' + CONVERT(VARCHAR, CAST(object_id AS BINARY(4)), 2))
SELECT T.name,
c.name,
type_name(c.user_type_id) AS Type
FROM T
JOIN tempdb.sys.columns c
ON c.object_id = T.object_id;

Find multiple tables with duplicate data in sql server

I have many tables in my database. I want to create a general query which will search for duplicate records in all columns of all the tables in a database in SQL server.
Something like this,
select
T.NAME as TABLE_NAME,
C.NAME as COLUMN_NAME
from
SYS.TABLES as T
inner join
SYS.COLUMNS C on T.OBJECT_ID = C.OBJECT_ID
group by
T.NAME, C.NAME
having
count(*) > 1
I do not know how to do it or if there is any way to do it.
You can find duplicate column names with something like this:
select column_name
from information_schema.columns
group by column_name
having count(*) > 1;
You can join back to information_schema.columns to get the table names. You might also want to limit the search to base tables (it is unclear what you mean by "duplicates").
If you want a quick way to see if two tables are the same you can start with CHECKSUM_AGG and compare the results for the two tables. If they have the same column types and the results of CHECKSUM_AGG are equal, the contents are very likely the same regardless of row order; because checksums can collide, a match is strong evidence rather than proof.
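A minimal sketch of that comparison, assuming two hypothetical tables dbo.TableA and dbo.TableB with identical column lists. CHECKSUM(*) raises an error for tables containing noncomparable types such as xml, text or image.

```sql
-- Hypothetical tables: dbo.TableA and dbo.TableB with the same columns.
SELECT
    (SELECT CHECKSUM_AGG(CHECKSUM(*)) FROM dbo.TableA) AS ChecksumA,
    (SELECT CHECKSUM_AGG(CHECKSUM(*)) FROM dbo.TableB) AS ChecksumB;

-- Follow-up check when the checksums match: rows present in one table only.
-- EXCEPT compares distinct rows, so it ignores differences in duplicate counts.
SELECT * FROM dbo.TableA
EXCEPT
SELECT * FROM dbo.TableB;

SELECT * FROM dbo.TableB
EXCEPT
SELECT * FROM dbo.TableA;
```

Different checksums mean the tables differ; equal checksums plus two empty EXCEPT results make a difference unlikely, apart from duplicate-row counts.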

How can I join sys.columns of view to sys.columns of the table it is referencing in T-SQL?

I am trying to join records in sys.columns for a view to the records in sys.columns for the table it is referencing, because I need the values of the is_nullable, is_computed and default_object_id columns for the columns that are selected in the view.
The sys.columns records for the view have "incorrect" values, which you can observe by running the example queries below:
CREATE TABLE TestTable (
FieldA int NOT NULL,
FieldB int DEFAULT (1),
FieldC as CONVERT(INT, FieldA + FieldB),
FieldD int NOT NULL
)
GO
CREATE VIEW TestView WITH SCHEMABINDING AS
SELECT FieldA, FieldC as TestC, FieldB + FieldC as TestD
FROM dbo.TestTable WHERE FieldD = 1
GO
SELECT OBJECT_NAME(c.object_id) as ViewName, c.name as ColumnName,
c.is_nullable as Nullable, c.is_computed as Computed,
cast(CASE WHEN c.default_object_id > 0 THEN 1 ELSE 0 END as bit) as HasDefault
FROM sys.columns c
WHERE object_id = OBJECT_ID('TestTable')
GO
SELECT OBJECT_NAME(c.object_id) as ViewName, c.name as ColumnName,
c.is_nullable as Nullable, c.is_computed as Computed,
cast(CASE WHEN c.default_object_id > 0 THEN 1 ELSE 0 END as bit) as HasDefault
FROM sys.columns c
WHERE object_id = OBJECT_ID('TestView')
GO
I have tried using system views to join on dependencies, but they do not give us information about which column in the view refers to which column in the table:
-- dm_sql_referenced_entities gives us all columns referenced, but all records
-- have referencing_minor_id 0, so we do not know which column refers to what
SELECT * FROM sys.dm_sql_referenced_entities('dbo.TestView', 'OBJECT')
GO
-- sql_dependencies gives us all columns referenced, but all records have
-- column_id 0, so we cannot use this for joining the columns either
SELECT * FROM sys.sql_dependencies WHERE object_id = OBJECT_ID('TestView')
GO
-- sql_expression_dependencies just tells us what table we are referencing
-- if the view is not created WITH SCHEMABINDING. If it is, it will return columns,
-- but with referencing_minor_id 0 for all records, so we cannot use this either
SELECT * FROM sys.sql_expression_dependencies
WHERE referencing_id = OBJECT_ID('TestView')
GO
This unanswered post on social.msdn submitted by someone seems to be the same issue:
http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/4ae5869f-bf64-4eef-a952-9ac40c932cd4
You said "I need to know that TestView TestC refers to a computed column". This is not supported by SQL Server 2008 R2 (I'm not sure about 2012, but I doubt it).
First, you can query sys.columns or INFORMATION_SCHEMA.COLUMNS, but you won't find what you want there.
If you dig deeper, you will most probably try sys.sql_expression_dependencies and sys.dm_sql_referenced_entities (N'dbo.TestView', N'OBJECT'), but you will find only a table-column mapping there, not column-column. SQL Server stores dependency information per 'high level' object (table, trigger...), not per detail (column). You will find the same in sys.sysdepends. As a matter of fact, dependency information in SQL Server is unreliable.
In the end, your only option would be to parse the view body yourself. It can be found in sys.sql_modules:
SELECT m.definition
FROM
sys.objects o
JOIN sys.sql_modules m
ON m.object_id = o.object_id
WHERE
o.object_id = object_id('dbo.TestView')
and o.type = 'V'
Parsing T-SQL is VERY hard, it could really push you to the limit of your efforts. For instance, it should be more or less easy to grab table references from the view, and then table columns, especially if your view is schema-bound. But if it's not, well... just think of asterisks that reference OUTER APPLY, which references recursive CTE...
Anyway, good luck!
Currently, when views are created there are two operations that happen at a high level: parsing and binding. Parsing is basically checking the syntax of the statement, keywords and such. Binding is the process of mapping the identifiers (object names, column names) in the statement to the corresponding objects (tables, views, functions, columns etc.) and deriving the types. Additionally, in a view, as in a SELECT statement, you can optionally alias column references and expressions in the SELECT list or after the view name (ex: create view v1(a) as select i from t).
After binding, we persist only the column aliases & the derived types in the metadata since a view is logically a table derived from a query expression. So there is currently no way to determine the expression that the column aliases map to or what it contains (columns or functions or literals etc.)
The only way to obtain the information you are looking for is to parse the view definition and perform your own binding. I believe we already have a bug that tracks the feature request to expose richer dependency information regarding the mapping of aliases to column expressions in view definitions.
Lastly, SQL Server Developer Studio or Visual Studio Database Project uses the managed T-SQL parser to track such references so you can do refactoring or renaming for example using the project.
Hope this helps clarify the problem/current implementation.
It seems like the misunderstanding is the assumption that object_id is a primary key in sys.columns. It is not. The object_id in sys.columns relates to the object_id in sys.objects.
So:
SELECT C.*
FROM sys.objects T
INNER JOIN sys.columns C
ON T.object_id = C.object_id
WHERE T.type in ('S','U') -- System Tables and User Tables
AND T.name = 'Address' -- Table Name
order by C.Column_ID
will return the columns in the "Address" table in AdventureWorks.

How can I tell if a database table is being accessed anymore? Want something like a "SELECT trigger"

I have a very large database with hundreds of tables, and after many, many product upgrades, I'm sure half of them aren't being used anymore. How can I tell if a table is actively being selected from? I can't just use Profiler - not only do I want to watch for more than a few days, but there are thousands of stored procedures as well, and Profiler won't translate the SP calls into table access calls.
The only thing I can think of is to create a clustered index on the tables of interest, and then monitor sys.dm_db_index_usage_stats to see if there are any seeks or scans on the clustered index, meaning that data from the table was loaded. However, adding a clustered index on every table is a bad idea (for any number of reasons), and isn't really feasible.
Are there other options I have? I've always wanted a feature like a "SELECT trigger", but there are probably other reasons why SQL Server doesn't have that feature either.
SOLUTION:
Thanks, Remus, for pointing me in the right direction. Using those columns, I've created the following SELECT, which does exactly what I want.
WITH LastActivity (ObjectID, LastAction) AS
(
SELECT object_id AS TableName,
last_user_seek as LastAction
FROM sys.dm_db_index_usage_stats u
WHERE database_id = db_id(db_name())
UNION
SELECT object_id AS TableName,
last_user_scan as LastAction
FROM sys.dm_db_index_usage_stats u
WHERE database_id = db_id(db_name())
UNION
SELECT object_id AS TableName,
last_user_lookup as LastAction
FROM sys.dm_db_index_usage_stats u
WHERE database_id = db_id(db_name())
)
SELECT OBJECT_NAME(so.object_id) AS TableName,
MAX(la.LastAction) as LastSelect
FROM sys.objects so
LEFT
JOIN LastActivity la
on so.object_id = la.ObjectID
WHERE so.type = 'U'
AND so.object_id > 100
GROUP BY OBJECT_NAME(so.object_id)
ORDER BY OBJECT_NAME(so.object_id)
Look in sys.dm_db_index_usage_stats. The columns last_user_xxx will contain the last time the table was accessed from user requests. This view resets its tracking after a server restart, so you must leave the server running for a while before relying on its data.
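To judge how much history the view currently holds, you can check when the instance last started. This is a small sketch; sys.dm_os_sys_info exposes sqlserver_start_time on SQL Server 2008 and later.

```sql
-- Usage stats cannot be older than the last instance start.
SELECT sqlserver_start_time,
       DATEDIFF(DAY, sqlserver_start_time, SYSDATETIME()) AS days_of_history
FROM sys.dm_os_sys_info;
```

A table with no entry after only a few days of uptime tells you much less than one with no entry after several months.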
Re: Profiler, if you monitor for SP:StmtCompleted, that will capture all statements executing within a stored procedure, so that will catch table accesses within a sproc. If not everything goes through stored procedures, you may also need the SQL:StmtCompleted event.
There will be a large number of events so it's probably still not practical to trace over a long time due to the size of trace. However, you could apply a filter - e.g. where TextData contains the name of your table you want to check for. You could give a list of table names to filter on at any one time and work through them gradually. So you should not get any trace events if none of those tables have been accessed.
Even if you feel it's not a suitable/viable approach for you, I thought it was worth expanding on.
Another solution would be to do a global search of your source code to find references to the tables. You can query the stored procedure definitions to check for matches for a given table, or just generate a complete database script and do a Find on that for table names.
For SQL Server 2008 you should take a look at SQL Auditing. This allows you to audit many things including selects on a table and reports to a file or Events Log.
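A rough sketch of such an audit follows. The audit names, the C:\AuditLogs\ folder and the table dbo.Test_Results_LU in YourDB are all assumptions for illustration, and database-level audit specifications may not be available in every edition.

```sql
-- Assumed names and paths throughout; adjust to your environment.
USE master;
CREATE SERVER AUDIT TableUsageAudit
    TO FILE (FILEPATH = N'C:\AuditLogs\');   -- the folder must already exist
ALTER SERVER AUDIT TableUsageAudit WITH (STATE = ON);

USE YourDB;
CREATE DATABASE AUDIT SPECIFICATION TableUsageSpec
    FOR SERVER AUDIT TableUsageAudit
    ADD (SELECT ON OBJECT::dbo.Test_Results_LU BY public)
    WITH (STATE = ON);

-- Later: read back what was captured.
SELECT event_time, server_principal_name, object_name, statement
FROM sys.fn_get_audit_file(N'C:\AuditLogs\*.sqlaudit', DEFAULT, DEFAULT);
```

Auditing BY public is meant to capture SELECTs from every database user; add one ADD (...) clause per table you want to watch.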
The following query uses the query plan cache to see if there's a reference to a table in any of the existing plans in cache. This is not guaranteed to be 100% accurate (since query plans are flushed out if there are memory constraints) but can be used to get some insights on table use.
SELECT schema_name(schema_id) as schemaName, t.name as tableName,
databases.name,
dm_exec_sql_text.text AS TSQL_Text,
dm_exec_query_stats.creation_time,
dm_exec_query_stats.execution_count,
dm_exec_query_stats.total_worker_time AS total_cpu_time,
dm_exec_query_stats.total_elapsed_time,
dm_exec_query_stats.total_logical_reads,
dm_exec_query_stats.total_physical_reads,
dm_exec_query_plan.query_plan
FROM sys.dm_exec_query_stats
CROSS APPLY sys.dm_exec_sql_text(dm_exec_query_stats.plan_handle)
CROSS APPLY sys.dm_exec_query_plan(dm_exec_query_stats.plan_handle)
INNER JOIN sys.databases ON dm_exec_sql_text.dbid = databases.database_id
RIGHT JOIN sys.tables t (NOLOCK) ON cast(dm_exec_query_plan.query_plan as varchar(max)) like '%' + t.name + '%'
I had in mind to play with user permissions for different tables, but then I remembered that you can turn on tracing with an ON LOGON trigger, which you might benefit from (note that this example uses Oracle syntax, not SQL Server):
CREATE OR REPLACE TRIGGER SYS.ON_LOGON_ALL
AFTER LOGON ON DATABASE
WHEN (
USER = 'MAX'
)
BEGIN
EXECUTE IMMEDIATE 'ALTER SESSION SET SQL_TRACE TRUE';
--EXECUTE IMMEDIATE 'alter session set events ''10046 trace name context forever level 12''';
EXCEPTION
WHEN OTHERS THEN
NULL;
END;
/
Then you can check your trace files.
This solution works better for me than the solution above. It is still limited by the fact that the usage data is lost when the server is restarted, but it gives you a good idea of which tables are not used.
SELECT [name]
,[object_id]
,[principal_id]
,[schema_id]
,[parent_object_id]
,[type]
,[type_desc]
,[create_date]
,[modify_date]
,[is_ms_shipped]
,[is_published]
,[is_schema_published]
FROM [COMTrans].[sys].[all_objects]
where object_id not in (
select object_id from sys.dm_db_index_usage_stats
where database_id = DB_ID('COMTrans') -- only count usage recorded for this database
)
and type='U'
order by name
