Retention period in SQL Server 2008 Change Tracking - sql-server

I'm a little worried about the default retention period in SQL Server 2008 Change Tracking (which is 2 days).
Is it a good idea to set this period to, e.g., 100 years and turn auto cleanup off, or will it bite me back in the future with excessive storage usage and/or performance degradation? Does anyone have experience in this matter?

If you turn auto cleanup off, it's best to periodically remove the change tracking information yourself, by disabling and then re-enabling change tracking for each table. Otherwise, yes, the tracking data will continue to grow and grow.
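For reference, a minimal sketch of both knobs, assuming a database named MyDb and a tracked table dbo.MyTable (both names are placeholders), and assuming the server accepts a retention value that large:

-- Database level: stretch the retention window and turn auto cleanup off.
-- Retention is expressed in DAYS, HOURS, or MINUTES; ~100 years is roughly 36500 days.
ALTER DATABASE MyDb
SET CHANGE_TRACKING (CHANGE_RETENTION = 36500 DAYS, AUTO_CLEANUP = OFF);

-- Table level: disabling and then re-enabling change tracking discards the
-- accumulated tracking data for that table (i.e. manual cleanup).
ALTER TABLE dbo.MyTable DISABLE CHANGE_TRACKING;
ALTER TABLE dbo.MyTable ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = OFF);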
You can't query the underlying tables directly, but you can poke at their metadata. The following query shows relative row counts:
select
s.name as schema_name
, t.name as table_name
, (select sum(rows) from sys.partitions x where o.parent_object_id = x.object_id) as rows_in_base_table
, o.name as tracking_table
, p.rows as rows_in_tracking_table
from sys.objects o
join sys.tables t on o.parent_object_id = t.object_id
join sys.schemas s on t.schema_id = s.schema_id
join sys.partitions p on o.object_id = p.object_id
where o.name like 'change[_]tracking%'
and o.schema_id = schema_id('sys')
order by schema_name, table_name
Run that in your database, and you should get a rough sense of current overhead.
The change tracking tables all follow a standard schema. For example:
select
c.name, c.column_id
, type_name(user_type_id) as type_name
, c.max_length, c.precision, c.scale
, c.is_nullable, c.is_identity
from sys.columns c
where object_id = (
select top 1 object_id from sys.objects o
where o.name like 'change[_]tracking%'
and o.schema_id = schema_id('sys')
)
The k_% columns vary by table and correspond to the primary keys of the tracked table. You are looking at a base minimum overhead of 18 bytes + (primary key length) per row. That adds up!
For example, I'm tracking some skinny base tables that are only 15 bytes wide, with a 7-byte composite key. That makes the tracking tables 18+7=25 bytes wide!
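The flip side of a long retention window: consumers of change tracking are supposed to compare their last-synced version against the minimum valid version before reading changes. A sketch of that check, again with dbo.MyTable as a placeholder:

-- If cleanup has discarded versions the client still needs, it must re-initialize.
DECLARE @last_sync_version bigint = 0; -- assumption: persisted by the client after its previous sync

IF @last_sync_version < CHANGE_TRACKING_MIN_VALID_VERSION(OBJECT_ID('dbo.MyTable'))
    PRINT 'Changes were cleaned up; a full re-sync of dbo.MyTable is required.';
ELSE
    SELECT ct.*
    FROM CHANGETABLE(CHANGES dbo.MyTable, @last_sync_version) AS ct;

With auto cleanup off, that first branch should never fire, which is exactly the trade you're making against storage.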


Ordinal column position from sys schema

How can I get the equivalent of the following query, which works on both views and tables, by using the sys schema?
SELECT c.column_name,c.ordinal_position
from information_schema.columns c
where TABLE_SCHEMA='dbo'
and table_name='def_transaction_pt'
I have used the following until now, but I don't know where to get the ordinal column position from:
select *
from
sys.objects o
inner join sys.columns c on o.object_id=c.object_id
where o.name='def_transaction_pt'
You are on the right path...
DECLARE @ObjectName sysname = 'def_transaction_pt',
        @SchemaName sysname = 'dbo'
SELECT c.name AS Column_Name, ROW_NUMBER() OVER(ORDER BY c.column_id) AS Ordinal_Position
FROM sys.columns AS c
JOIN sys.objects AS o
ON c.object_id = o.object_id
JOIN sys.schemas AS s
ON o.schema_id = s.schema_id
WHERE o.name = @ObjectName
AND s.name = @SchemaName
Note: I've only joined sys.schemas to enable filtering by schema name, and sys.objects to allow filtering by table/view name.
After reading the comments and documentation, I've decided to adopt Larnu's suggestion to use ROW_NUMBER instead of the column_id directly.
Also, I've changed all_columns to columns, since the latter already contains the columns of both tables and views.
BTW, the official documentation of INFORMATION_SCHEMA.COLUMNS describes ORDINAL_POSITION as "Column identification number." - which might be just the same as the column_id; it would require more testing to figure that part out.
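If you want to test that part yourself, a quick comparison of the two (using the same table as above) could look like this:

-- Compare sys.columns.column_id against INFORMATION_SCHEMA's ORDINAL_POSITION
SELECT c.name AS column_name,
       c.column_id,
       isc.ORDINAL_POSITION
FROM sys.columns AS c
JOIN INFORMATION_SCHEMA.COLUMNS AS isc
    ON  isc.TABLE_SCHEMA = 'dbo'
    AND isc.TABLE_NAME   = 'def_transaction_pt'
    AND isc.COLUMN_NAME  = c.name
WHERE c.object_id = OBJECT_ID('dbo.def_transaction_pt')
ORDER BY c.column_id;

Any row where the two numbers disagree would settle the question.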
BTW #2: Though you can't change a column's ordinal position using ALTER TABLE, it is possible with the visual designer (which, in turn, drops and re-creates the table) - and if you do that, the column_id of every column is recalculated, so it always corresponds to the actual ordinal position (gaps aside).

Check if any database table has any rows

I am trying to understand how to check whether any table in my DB has data, using Entity Framework. I can check for one table, but how can I check all tables at once? Do we have any option with EF6?
using (var db = new CreateDbContext())
{
if(!db.FirstTable.Any())
{
// The table is empty
}
}
Any pointers on how to loop through entities would be helpful.
Here is one way you could do this with T-SQL. It should be lightning fast on most systems: it returned in less than a second on our ERP database, reporting 421 billion rows across more than 15,000 partition stats.
select sum(p.row_count)
from sys.dm_db_partition_stats p
join sys.objects o on o.object_id = p.object_id
where o.type not in ('S', 'IT') --excludes system and internal tables.
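If all you need is a yes/no answer ("does any table have rows?"), you could wrap the same DMV in an EXISTS check; a sketch:

IF EXISTS (SELECT 1
           FROM sys.dm_db_partition_stats p
           JOIN sys.objects o ON o.object_id = p.object_id
           WHERE o.type NOT IN ('S', 'IT') --excludes system and internal tables
             AND p.row_count > 0)
    PRINT 'At least one user table has rows.';
ELSE
    PRINT 'All user tables are empty.';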
Similar to @SeanLange's answer, but this shows the schema name and table name of tables without any rows.
SELECT DISTINCT OBJECT_SCHEMA_NAME(p.object_id) AS [Schema],
       OBJECT_NAME(p.object_id) AS [Table]
FROM sys.partitions p
INNER JOIN sys.indexes i
    ON p.object_id = i.object_id
    AND p.index_id = i.index_id
WHERE OBJECT_SCHEMA_NAME(p.object_id) != 'sys'
  AND p.rows = 0
ORDER BY [Schema], [Table]

What happens when a nonclustered index is deleted?

What happens on the SQL Server engine side when I'm deleting an index from one of my tables?
Details: I have a database running in production.
In this database, I have a query that creates deadlocks on a regular basis. I found the query creating the deadlocks, ran it on my machine, and looked at its execution plan; SQL Server Management Studio proposes to add an index on one specific table.
The index makes sense to me but my problem is that, on this table I already have 3 indexes and, to be honest, I cannot be sure if they're properly used or if they've been created for a specific role.
I could simply add one more index on the table but I'm concerned about the cost I'll pay each time I add/update/delete data on the table.
I made a few attempts on my machine, and it seems that I need to delete at least two other indexes to make the engine select the index I'm creating today (which looks odd to me). As soon as I force the engine to take my index (because I deleted everything else), the query runs 10 times faster.
Can I simply use the DROP INDEX command without much problem? I don't have to rebuild anything afterwards?
A non-clustered index is a secondary data structure - it's not "integrated" into the main data structure of the clustered index or heap.
As such, deleting one should be a relatively fast and pain-free experience, since all the system should need to do is remove metadata about the index and mark the index's pages as unallocated.
There shouldn't be a need to "rebuild" or do anything else.
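To answer the literal question: yes, it's a single statement, e.g. (names here are placeholders):

DROP INDEX IX_MyTable_SomeColumn ON dbo.MyTable;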
You can run this query to find out whether the indexes on a table are being used or not:
SELECT TOP 50
o.name AS ObjectName
, i.name AS IndexName
, i.index_id AS IndexID
, dm_ius.user_seeks AS UserSeek
, dm_ius.user_scans AS UserScans
, dm_ius.user_lookups AS UserLookups
, dm_ius.user_updates AS UserUpdates
, p.TableRows
, 'DROP INDEX ' + QUOTENAME(i.name)
+ ' ON ' + QUOTENAME(s.name) + '.'
+ QUOTENAME(OBJECT_NAME(dm_ius.OBJECT_ID)) AS 'drop statement'
FROM sys.dm_db_index_usage_stats dm_ius
INNER JOIN sys.indexes i ON i.index_id = dm_ius.index_id
AND dm_ius.OBJECT_ID = i.OBJECT_ID
INNER JOIN sys.objects o ON dm_ius.OBJECT_ID = o.OBJECT_ID
INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
INNER JOIN (SELECT SUM(p.rows) TableRows, p.index_id, p.OBJECT_ID
FROM sys.partitions p GROUP BY p.index_id, p.OBJECT_ID) p
ON p.index_id = dm_ius.index_id AND dm_ius.OBJECT_ID = p.OBJECT_ID
WHERE OBJECTPROPERTY(dm_ius.OBJECT_ID,'IsUserTable') = 1 and o.name='your_table_name'
AND dm_ius.database_id = DB_ID()
AND i.type_desc = 'nonclustered'
AND i.is_primary_key = 0
AND i.is_unique_constraint = 0
ORDER BY (dm_ius.user_seeks + dm_ius.user_scans + dm_ius.user_lookups) ASC
GO
Then go with the DROP INDEX statement for any of them that are not used.
Before dropping, ensure the server has been up long enough that the indexes have had a chance to be used; the usage counters are reset on restart (see the uptime check below).
Also, having three indexes on a table is not many. Just check that they are useful to your queries.
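One way to check how long the instance has been up (2008 and later):

-- The usage-stats counters are reset when SQL Server restarts,
-- so check when the instance last started before trusting them.
SELECT sqlserver_start_time
FROM sys.dm_os_sys_info;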

list of tables without indexes in sql 2008

How do I list tables without indexes in my SQL 2008 database?
Edit
I want the Schema name and the Table name.
This should cover what you're looking for, i.e. tables that are heaps (no clustered index) and that don't have any non-clustered indexes either. It uses the sys catalog views introduced in 2005/2008.
In addition, you probably want to look for tables that do have a clustered index but no nonclustered indexes (this is the second part of the statement, which I've left commented out).
SELECT
schemaname = OBJECT_SCHEMA_NAME(o.object_id)
,tablename = o.NAME
FROM sys.objects o
INNER JOIN sys.indexes i ON i.OBJECT_ID = o.OBJECT_ID
-- tables that are heaps without any nonclustered indexes
WHERE (
o.type = 'U'
AND o.OBJECT_ID NOT IN (
SELECT OBJECT_ID
FROM sys.indexes
WHERE index_id > 0
)
)
-- OR
-- table that have a clustered index without any nonclustered indexes
--(o.type='U'
-- AND o.OBJECT_ID NOT IN (
-- SELECT OBJECT_ID
-- FROM sys.indexes
-- WHERE index_id>1))
Here's an example:
select SCHEMA_NAME(schema_id), name from sys.tables
where OBJECTPROPERTY(object_id, 'IsIndexed')= 0
In addition to @Philip Fourie's suggestion, you might want to think about which indexes to create.
Once you have been accessing your data, SQL Server 2008 keeps track of places where it thinks indexes would be helpful (it refers to these as "missing indexes"). There are a handful of Dynamic Management Views which can show these missing indexes and some info about them; a sample query joining them follows the list.
From MSSQLTips:
sys.dm_db_missing_index_details - Returns detailed information about a missing index
sys.dm_db_missing_index_group_stats - Returns summary information about missing index groups
sys.dm_db_missing_index_groups - Returns information about a specific group of missing indexes
sys.dm_db_missing_index_columns(index_handle) - Returns information about the database table columns that are missing for an index. This is a function and requires the index_handle to be passed.
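As a starting point, a sketch that joins the three views, ordered by a rough impact estimate:

SELECT d.statement AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       gs.user_seeks,
       gs.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
    ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS gs
    ON gs.group_handle = g.index_group_handle
WHERE d.database_id = DB_ID()
ORDER BY gs.user_seeks * gs.avg_user_impact DESC;

As always with these DMVs, treat the output as suggestions to review, not as indexes to create blindly.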
select schema_name = s.name, table_name = o.name
from sys.objects o
join sys.schemas s on o.schema_id = s.schema_id
where type = 'U'
and not exists (select i.index_id
from sys.indexes i
where i.type <> 0 --ignore the default heap index row
and o.object_id = i.object_id )
Edit:
I have updated the SQL to include the schema name as requested. (Note I had to use sys.objects instead of sysobjects, to cater for schemas, which were introduced in SQL 2005.)
The catalog tables are documented in the SQL Server documentation, see this link.
This FAQ contains more samples and might also be useful.
Note that these are system tables and can change between SQL Server versions; where possible, rather use the version-independent views known as Information Schema Views.
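For example, the version-independent way to list user tables is:

SELECT TABLE_SCHEMA, TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE';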
This code gives all the details about the indexes for all the tables:
SELECT
sch.name AS [Schema],
obj.name AS TableName,
indx.name AS IndexName,
CASE
WHEN indx.type_desc = 'HEAP' THEN 'N/A'
ELSE indx.type_desc
END AS IndexType
FROM sys.objects obj
JOIN sys.indexes indx ON indx.object_id = obj.object_id
JOIN sys.schemas AS sch ON sch.schema_id = obj.schema_id
WHERE
obj.type = 'U'
ORDER BY
obj.name

How do I get a list of tables affected by a set of stored procedures?

I have a huge database with some 100 tables and some 250 stored procedures. I want to know the list of tables affected by a subset of stored procedures. For example, I have a list of 50 stored procedures, out of 250, and I want to know the list of tables that will be affected by these 50 stored procedures. Is there any easy way for doing this, other than reading all the stored procedures and finding the list of tables manually?
PS: I am using SQL Server 2000 and SQL Server 2005 clients for this.
This would be your SQL Server query:
SELECT
[NAME]
FROM
sysobjects
WHERE
xType = 'U' AND --specifies a user table object
id in
(
SELECT
sd.depid
FROM
sysobjects so,
sysdepends sd
WHERE
so.name = 'NameOfStoredProcedure' AND
sd.id = so.id
)
Hope this helps someone.
sp_depends 'StoredProcName' will return the object name and object type that the stored proc depends on.
EDIT: I like @KG's answer better. More flexible, IMHO.
I'd do it this way in SQL 2005 (uncomment the "AND" line if you only want it for a particular proc):
SELECT
[Proc] = SCHEMA_NAME(p.schema_id) + '.' + p.name,
[Table] = SCHEMA_NAME(t.schema_id) + '.' + t.name,
[Column] = c.name,
d.is_selected,
d.is_updated
FROM sys.procedures p
INNER JOIN sys.sql_dependencies d
ON d.object_id = p.object_id
AND d.class IN (0,1)
INNER JOIN sys.tables t
ON t.object_id = d.referenced_major_id
INNER JOIN sys.columns c
ON c.object_id = t.object_id
AND c.column_id = d.referenced_minor_id
WHERE p.type IN ('P')
-- AND p.object_id = OBJECT_ID('MyProc')
ORDER BY
1, 2, 3
One very invasive option would be to take a copy of the database and put a trigger on every table that logs that something happened, then run all the SPs. If you can't make lots of modifications to the DB, that won't work.
Also, be sure to add the logging to existing triggers, rather than replacing them with logging, if you also want the tables that the SPs affect via triggers.
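A minimal sketch of such a logging trigger (one per table; all names here are hypothetical):

-- A single log table records which table was touched, and when.
CREATE TABLE dbo.TableTouchLog (
    table_name sysname NOT NULL,
    touched_at datetime NOT NULL DEFAULT GETDATE()
);
GO

-- Repeat for every table you want to watch.
CREATE TRIGGER trg_MyTable_Touch ON dbo.MyTable
AFTER INSERT, UPDATE, DELETE
AS
    INSERT INTO dbo.TableTouchLog (table_name) VALUES ('dbo.MyTable');
GO

Run the procedures, then SELECT DISTINCT table_name FROM dbo.TableTouchLog to get the affected tables.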
