How to check if data has changed in database - sql-server

In order to create a dashboard, I need to work with a database that I have no information about. The data in the database is updated through a particular management interface, and I do not know where it is updated in the DB. Is there a way to check for data updates without knowing the table names or column names?
Thanks

I agree that knowing nothing about the database will not get you very far, but you can get some information about table updates without knowing what you're looking for. Whether or not it's useful is a different question.
The included code leverages sys.dm_db_index_usage_stats to identify the last time a user update was performed on a table. This works for heaps as well as indexed tables, and the user_updates value includes inserts, deletes, and updates. The sys.dm_db_index_usage_stats view only includes information for tables that have been interacted with. To work around that, I UNION ALL a second query that gets the relevant object information for tables that aren't found in sys.dm_db_index_usage_stats. This gives a complete view of the table objects regardless of whether they've been used since the service started. You may not care about that at all and could strip it out.
Again, this may not be helpful, but your question was just "Is there a way to check for data updates without the tables names or columns names?" and the answer to that specific question is yes.
Caveats:
The user_updates value is simply an incrementing counter of update actions. It will not tell you what the update was or how many updates have occurred since a given point in time, but it will tell you the last time a table was updated.
This information does not persist beyond a service restart, so if a table was updated right before a restart, you wouldn't know.
The provided script is database-specific, meaning it only returns information about the database it runs in. You could use something like sp_MSforeachdb to run it against every database, though.
And the code...
SELECT * FROM
(
    SELECT
        @@SERVERNAME AS servername
        , DB_NAME(database_id) AS DatabaseName
        , u.object_id
        , SchemaName = OBJECT_SCHEMA_NAME(u.object_id, database_id)
        , TableName = OBJECT_NAME(u.object_id, database_id)
        , Writes = SUM(user_updates)
        , LastUpdate = CASE WHEN MAX(u.last_user_update) IS NULL THEN CAST('17530101' AS DATETIME) ELSE MAX(u.last_user_update) END
    FROM sys.dm_db_index_usage_stats u
    JOIN sys.indexes i
        ON u.index_id = i.index_id
        AND u.object_id = i.object_id
    WHERE u.database_id = DB_ID()
    GROUP BY database_id, u.object_id
    UNION ALL
    SELECT
        @@SERVERNAME AS servername
        , DB_NAME() AS DatabaseName
        , o.object_id
        , OBJECT_SCHEMA_NAME(o.object_id, DB_ID())
        , OBJECT_NAME(o.object_id, DB_ID())
        , 0
        , CAST('17530101' AS DATETIME)
    FROM sys.indexes i
    JOIN sys.objects o ON i.object_id = o.object_id
    WHERE o.type_desc = 'USER_TABLE'
        AND i.index_id NOT IN (SELECT s.index_id FROM sys.dm_db_index_usage_stats s
                               WHERE s.object_id = i.object_id
                                 AND i.index_id = s.index_id
                                 AND s.database_id = DB_ID())
) AS temp
ORDER BY Writes DESC
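If you do want to run this across every database as suggested above, a sketch using sp_MSforeachdb might look like the following. Note that sp_MSforeachdb is undocumented and has known quirks, so treat this as an assumption about your environment rather than a guaranteed approach; the query is a reduced version of the one above.

```sql
-- Sketch only: sp_MSforeachdb is undocumented; verify behavior on your instance.
EXEC sp_MSforeachdb N'
USE [?];
SELECT
    @@SERVERNAME                    AS servername,
    DB_NAME()                       AS DatabaseName,
    OBJECT_SCHEMA_NAME(u.object_id) AS SchemaName,
    OBJECT_NAME(u.object_id)        AS TableName,
    SUM(u.user_updates)             AS Writes,
    MAX(u.last_user_update)         AS LastUpdate
FROM sys.dm_db_index_usage_stats u
WHERE u.database_id = DB_ID()
GROUP BY u.object_id;';
```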

Related

How to check a database in SQL to make sure that all of its tables are in use?

I was assigned to check whether all of the current tables in a database are used and, if not, to drop them. These are the steps I have taken so far:
Searched table names in the program that uses the database, to see if the program issues any queries based on those table names.
Investigated whether a table's primary key is used anywhere else, such as in a view or another table (connectivity with other used tables). I used:
SELECT
t.name AS table_name,
SCHEMA_NAME(schema_id) AS schema_name,
c.name AS column_name
FROM
sys.tables AS t
INNER JOIN
sys.columns c ON t.OBJECT_ID = c.OBJECT_ID
WHERE
c.name LIKE 'DeflectionId' -- write the column you search here
ORDER BY
schema_name, table_name;
Searched inside all of the stored procedure texts to see if a table name has been used inside them:
SELECT DISTINCT
o.name AS Object_Name,
o.type_desc
FROM
sys.sql_modules m
INNER JOIN
sys.objects o ON m.object_id = o.object_id
WHERE
m.definition LIKE '%\[Test_Results_LU\]%' ESCAPE '\';
or
SELECT name
FROM sys.procedures
WHERE Object_definition(object_id) LIKE '%Test_Results_LU%'
(from this link: Search text in stored procedure in SQL Server )
Used the Object Explorer view to see if a table with a similar/same name and size exists in the database.
Do you think there are other ways that I can use to investigate it better?
Are these steps efficient at all? How would you do it?
Those are all reasonable things to check. One more thing to do would be to turn on profiling or auditing, depending on your SQL Server version, and actually monitor the tables being used for a reasonable time period. You may not be able to do that on a production system, and it's still not 100% guaranteed - what if there's an important table that's only queried once a year?
https://dba.stackexchange.com/questions/40960/logging-queries-and-other-t-sql
https://learn.microsoft.com/en-us/sql/relational-databases/security/auditing/view-a-sql-server-audit-log?view=sql-server-2017
One other suggestion before dropping the tables is to explicitly remove access to them (either with DENY/REVOKE or rename them to table-name_purge) for a week or two and see if anyone complains. If they don't, then it's probably safe to make a backup and then drop them.
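A minimal sketch of that suggestion, where dbo.SuspectTable is a hypothetical table name:

```sql
-- Option 1: block access and wait for complaints (hypothetical table name).
DENY SELECT, INSERT, UPDATE, DELETE ON dbo.SuspectTable TO public;

-- Option 2: rename so any dependent code fails loudly instead.
EXEC sp_rename 'dbo.SuspectTable', 'SuspectTable_purge';
```

The rename variant tends to surface problems faster, since broken queries fail with "invalid object name" rather than a permissions error that might be silently swallowed.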
A couple of other places to check. Both of these rely on data that is:
cached automatically by the system,
not persisted between restarts, and
liable to be dropped at any time,
so absence from these results does not prove that a table is unused, but you may find evidence that a table definitely is in use.
SELECT [Schema] = OBJECT_SCHEMA_NAME(object_id),
[ObjectName] = OBJECT_NAME(object_id),
*
FROM sys.dm_db_index_usage_stats
WHERE database_id = DB_ID()
And in the plan cache
USE YourDB
DROP TABLE IF EXISTS #cached_plans, #plans, #results
DECLARE @dbname nvarchar(300) = QUOTENAME(DB_NAME());
SELECT dm_exec_query_stats.creation_time,
       dm_exec_query_stats.last_execution_time,
       dm_exec_query_stats.execution_count,
       dm_exec_query_stats.sql_handle,
       dm_exec_query_stats.plan_handle
INTO   #cached_plans
FROM   sys.dm_exec_query_stats;
WITH distinctph
     AS (SELECT DISTINCT plan_handle
         FROM   #cached_plans)
SELECT query_plan,
       plan_handle
INTO   #plans
FROM   distinctph
       CROSS APPLY sys.dm_exec_query_plan(plan_handle);
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT cp.*,
       st.text,
       [Database] = n.value('@Database', 'nvarchar(300)'),
       [Schema]   = n.value('@Schema', 'nvarchar(300)'),
       [Table]    = n.value('@Table', 'nvarchar(300)')
INTO   #results
FROM   #cached_plans cp
       JOIN #plans p
         ON cp.plan_handle = p.plan_handle
       CROSS APPLY sys.dm_exec_sql_text(sql_handle) st
       CROSS APPLY query_plan.nodes('//Object[@Database = sql:variable("@dbname") and @Schema != "[sys]"]') qn(n);
SELECT *
FROM   #results

SQL Cause of table updates

Is there a way to determine how a table is updated? I have a table that is being updated and I can't figure out how: by an agent job? An SSIS package? A trigger?
I've queried against dm_exec_query_stats and dm_exec_sql_text to determine the statement that is being run, but I don't know where it's being executed from.
SELECT SQL_HANDLE, deqs.plan_handle, deqs.last_execution_time,
dest.text
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE dest.text LIKE '%Update%'
ORDER BY deqs.last_execution_time desc
Use this query to find any dependent stored procedure that has insert/update/select statements against a particular table. Alternatively, if you have a trigger on the table that captures an inserted datetime, you can analyse it by checking the currently running queries.
It would take a lot of digging to determine whether any SSIS jobs are scheduled to update the table; usually most updates are done through stored procedures. The query below is only a quick check.
SELECT DISTINCT QUOTENAME(OBJECT_SCHEMA_NAME(referencing.object_id)) + '.' + QUOTENAME(OBJECT_NAME(referencing.object_id)) AS SprocName
,QUOTENAME(OBJECT_SCHEMA_NAME(referenced.object_id)) + '.' + QUOTENAME(OBJECT_NAME(referenced.object_id)) AS ReferencedObjectName
,referenced.type_desc AS ReferencedObjectType
FROM sys.sql_dependencies d
INNER JOIN sys.procedures referencing ON referencing.object_id = d.object_id
INNER JOIN sys.objects referenced ON referenced.object_id = d.referenced_major_id
WHERE referencing.type_desc = 'SQL_STORED_PROCEDURE'
AND referenced.type_desc = 'USER_TABLE'
ORDER BY SprocName
,ReferencedObjectName
You can find out if your table is referenced in any stored procedure, function, view or trigger with this code.
Suppose the table you are looking for is called tblMyTable:
SELECT DISTINCT
o.name AS Object_Name,
o.type_desc,
m.*
FROM sys.sql_modules m
INNER JOIN sys.objects o ON m.object_id = o.object_id
WHERE m.definition Like '%tblMyTable%';
This will return any stored procedure, function, view or trigger that has tblMyTable in its text, even if it only reads from it.
So you need to check each one to see if it's updating the table or not.
Yes, it's some work, but at least it gives you a chance to find out whether the update is coming from any object in your database.
If this returns nothing, then you know you have to search in your client software.
I know this may not be a good way to do this and I might get burned for this but, here's what I did...
I executed an open transaction against the table; locking it.
BEGIN TRANSACTION
SELECT TOP 1 *
FROM tbl
I then queried dm_exec_requests to find the session_id that is being run
SELECT
sqltext.TEXT,
req.session_id
FROM sys.dm_exec_requests req
CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS sqltext
I then queried dm_exec_sessions to find the host_name and program_name that is running that session.
SELECT
host_name,
login_time,
program_name
FROM sys.dm_exec_sessions
WHERE session_id = 184
I was able to determine that the culprit was an SSIS package hosted on a different server. Remember to close the locking transaction (e.g. with ROLLBACK TRANSACTION) when you're done.
Thanks everyone for all the help

Why does a simple SQL count take long, when all updates, deletes and inserts are possible through the SQL Server engine only?

I see NO way to update, delete, or insert a record into a table without going through the SQL Server engine.
So the SQL Server engine clearly knows about any change to a table.
But even a simple Select Count(*) from tblTableName takes a long time. It does not look like the SQL Server engine maintains a counter and updates it whenever a change occurs in a table.
Every time, it starts some processing in the background to get the count, even for a single indexed table. What does it do, and why is no counter flag maintained?
Example :
Select count(*) from Bills
Select count(*) from Claims
In fact, such info is stored at index level: see the rows column in sys.sysindexes.
You could interrogate that view instead.
Some reasons why such a counter would be of limited use:
What about the case SELECT count(*) FROM Bills WHERE id_paid = 1?
What if SELECT COUNT(*) FROM bills needs fewer resources than querying the metadata view?
The counter would have to be updated every time rows are inserted into or deleted from the table. Wouldn't this be an overhead, taking into account that statistics also have to be updated?
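As a sketch of interrogating that metadata for approximate counts (sys.sysindexes is the compatibility view that exposes the rows column; counts are approximate, not transactionally exact):

```sql
-- Approximate row counts straight from index metadata; no table scan involved.
SELECT OBJECT_NAME(id) AS TableName,
       rows            AS ApproxRows
FROM sys.sysindexes
WHERE indid IN (0, 1)   -- heap (0) or clustered index (1) only, to avoid double counting
  AND OBJECTPROPERTY(id, 'IsUserTable') = 1
ORDER BY ApproxRows DESC;
```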
If you don't need absolutely up-to-date, accurate numbers, you can inspect the catalog views of SQL Server, which keep track of the number of rows in each table (what you called the counter flag) - more or less. On a really busy system with thousands of concurrent users, this number might not be accurate at all times - but for most cases it's quite good enough, and it doesn't go out and actually count the rows - so it's really fast:
SELECT
t.NAME AS TableName,
p.rows AS RowCounts
FROM
sys.tables t
INNER JOIN
sys.partitions p ON t.object_id = p.OBJECT_ID
WHERE
t.NAME NOT LIKE 'dt%'
AND t.is_ms_shipped = 0
GROUP BY
t.Name, p.Rows
ORDER BY
t.Name

How can I tell if a database table is being accessed anymore? Want something like a "SELECT trigger"

I have a very large database with hundreds of tables, and after many, many product upgrades, I'm sure half of them aren't being used anymore. How can I tell if a table is actively being selected from? I can't just use Profiler - not only do I want to watch for more than a few days, but there are thousands of stored procedures as well, and Profiler won't translate the SP calls into table access calls.
The only thing I can think of is to create a clustered index on the tables of interest, and then monitor sys.dm_db_index_usage_stats to see if there are any seeks or scans on the clustered index, meaning that data from the table was loaded. However, adding a clustered index to every table is a bad idea (for any number of reasons) and isn't really feasible.
Are there other options I have? I've always wanted a feature like a "SELECT trigger", but there are probably other reasons why SQL Server doesn't have that feature either.
SOLUTION:
Thanks, Remus, for pointing me in the right direction. Using those columns, I've created the following SELECT, which does exactly what I want.
WITH LastActivity (ObjectID, LastAction) AS
(
SELECT object_id AS TableName,
last_user_seek as LastAction
FROM sys.dm_db_index_usage_stats u
WHERE database_id = db_id(db_name())
UNION
SELECT object_id AS TableName,
last_user_scan as LastAction
FROM sys.dm_db_index_usage_stats u
WHERE database_id = db_id(db_name())
UNION
SELECT object_id AS TableName,
last_user_lookup as LastAction
FROM sys.dm_db_index_usage_stats u
WHERE database_id = db_id(db_name())
)
SELECT OBJECT_NAME(so.object_id) AS TableName,
MAX(la.LastAction) as LastSelect
FROM sys.objects so
LEFT
JOIN LastActivity la
on so.object_id = la.ObjectID
WHERE so.type = 'U'
AND so.object_id > 100
GROUP BY OBJECT_NAME(so.object_id)
ORDER BY OBJECT_NAME(so.object_id)
Look in sys.dm_db_index_usage_stats. The columns last_user_xxx will contain the last time the table was accessed from user requests. This table resets its tracking after a server restart, so you must leave it running for a while before relying on its data.
Re: Profiler, if you monitor for SP:StmtCompleted, that will capture all statements executing within a stored procedure, so that will catch table accesses within a sproc. If not everything goes through stored procedures, you may also need the SQL:StmtCompleted event.
There will be a large number of events, so it's probably still not practical to trace over a long time due to the size of the trace. However, you could apply a filter - e.g. where TextData contains the name of the table you want to check for. You could give a list of table names to filter on at any one time and work through them gradually. That way, you should not get any trace events if none of those tables have been accessed.
Even if you feel it's not a suitable/viable approach for you, I thought it was worth expanding on.
Another solution would be to do a global search of your source code to find references to the tables. You can query the stored procedure definitions to check for matches for a given table, or just generate a complete database script and do a Find on that for table names.
For SQL Server 2008 you should take a look at SQL Auditing. This allows you to audit many things including selects on a table and reports to a file or Events Log.
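A minimal sketch of such an audit follows; the audit names, table name, and file path are all placeholders for your environment, and the exact syntax should be checked against your SQL Server version's documentation.

```sql
-- Sketch only: all names and the file path are placeholders.
USE master;
CREATE SERVER AUDIT TableSelectAudit
    TO FILE (FILEPATH = 'C:\AuditLogs\');
ALTER SERVER AUDIT TableSelectAudit WITH (STATE = ON);

USE YourDB;
CREATE DATABASE AUDIT SPECIFICATION SuspectTableSelects
    FOR SERVER AUDIT TableSelectAudit
    ADD (SELECT ON dbo.SuspectTable BY public)
    WITH (STATE = ON);

-- Later, read the captured events back:
SELECT event_time, server_principal_name, statement
FROM sys.fn_get_audit_file('C:\AuditLogs\*', DEFAULT, DEFAULT);
```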
The following query uses the query plan cache to see if there's a reference to a table in any of the existing plans in cache. This is not guaranteed to be 100% accurate (since query plans are flushed out if there are memory constraints) but can be used to get some insights on table use.
SELECT schema_name(schema_id) as schemaName, t.name as tableName,
databases.name,
dm_exec_sql_text.text AS TSQL_Text,
dm_exec_query_stats.creation_time,
dm_exec_query_stats.execution_count,
dm_exec_query_stats.total_worker_time AS total_cpu_time,
dm_exec_query_stats.total_elapsed_time,
dm_exec_query_stats.total_logical_reads,
dm_exec_query_stats.total_physical_reads,
dm_exec_query_plan.query_plan
FROM sys.dm_exec_query_stats
CROSS APPLY sys.dm_exec_sql_text(dm_exec_query_stats.plan_handle)
CROSS APPLY sys.dm_exec_query_plan(dm_exec_query_stats.plan_handle)
INNER JOIN sys.databases ON dm_exec_sql_text.dbid = databases.database_id
RIGHT JOIN sys.tables t (NOLOCK) ON cast(dm_exec_query_plan.query_plan as varchar(max)) like '%' + t.name + '%'
I had in mind to play with user permissions for different tables, but then I remembered you can turn on tracing with an ON LOGON trigger. You might benefit from this (note: this is Oracle syntax, not SQL Server):
CREATE OR REPLACE TRIGGER SYS.ON_LOGON_ALL
AFTER LOGON ON DATABASE
WHEN (USER = 'MAX')
BEGIN
EXECUTE IMMEDIATE 'ALTER SESSION SET SQL_TRACE TRUE';
--EXECUTE IMMEDIATE 'alter session set events ''10046 trace name context forever level 12''';
EXCEPTION
WHEN OTHERS THEN
NULL;
END;
/
Then you can check your trace files.
This solution works better for me than the solution above. It is still limited in that it only covers the period since the server was last restarted, but it still gives you a good idea of which tables are not used.
SELECT [name]
,[object_id]
,[principal_id]
,[schema_id]
,[parent_object_id]
,[type]
,[type_desc]
,[create_date]
,[modify_date]
,[is_ms_shipped]
,[is_published]
,[is_schema_published]
FROM [COMTrans].[sys].[all_objects]
where object_id not in (
select object_id from sys.dm_db_index_usage_stats
)
and type='U'
order by name

How do I get a list of tables affected by a set of stored procedures?

I have a huge database with some 100 tables and some 250 stored procedures. I want to know the list of tables affected by a subset of stored procedures. For example, I have a list of 50 stored procedures, out of 250, and I want to know the list of tables that will be affected by these 50 stored procedures. Is there any easy way for doing this, other than reading all the stored procedures and finding the list of tables manually?
PS: I am using SQL Server 2000 and SQL Server 2005 clients for this.
This would be your SQL Server query:
SELECT
[NAME]
FROM
sysobjects
WHERE
xType = 'U' AND --specifies a user table object
id in
(
SELECT
sd.depid
FROM
sysobjects so,
sysdepends sd
WHERE
so.name = 'NameOfStoredProcedure' AND
sd.id = so.id
)
Hope this helps someone.
sp_depends 'StoredProcName' will return the object name and object type that the stored proc depends on.
EDIT: I like @KG's answer better. More flexible IMHO.
I'd do it this way in SQL 2005 (uncomment the "AND" line if you only want it for a particular proc):
SELECT
[Proc] = SCHEMA_NAME(p.schema_id) + '.' + p.name,
[Table] = SCHEMA_NAME(t.schema_id) + '.' + t.name,
[Column] = c.name,
d.is_selected,
d.is_updated
FROM sys.procedures p
INNER JOIN sys.sql_dependencies d
ON d.object_id = p.object_id
AND d.class IN (0,1)
INNER JOIN sys.tables t
ON t.object_id = d.referenced_major_id
INNER JOIN sys.columns c
ON c.object_id = t.object_id
AND c.column_id = d.referenced_minor_id
WHERE p.type IN ('P')
-- AND p.object_id = OBJECT_ID('MyProc')
ORDER BY
1, 2, 3
One very invasive option would be to take a duplicate of the database and set a trigger on every table that logs that something happened, then run all the SPs. If you can't make lots of modifications to the DB, that won't work.
Also, be sure to add the logging to existing triggers rather than replacing them with logging, if you also want the tables that the SPs affect via triggers.
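A sketch of such a logging trigger, with hypothetical table and log names (you would create one trigger like this per table of interest):

```sql
-- Hypothetical names throughout; this only records that the table was touched.
CREATE TABLE dbo.TableTouchLog (
    TableName sysname   NOT NULL,
    TouchedAt datetime2 NOT NULL DEFAULT SYSUTCDATETIME()
);
GO
CREATE TRIGGER trg_Touch_MyTable
ON dbo.MyTable
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT dbo.TableTouchLog (TableName) VALUES (N'dbo.MyTable');
END;
```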