SQL Server - Missing Indexes - What would use the index? - sql-server

I am using SQL Server 2008 and we are using the DMV's to find missing indexes. However, before I create the new index I am trying to figure out what proc/query is wanting that index. I want the most information I can get so I can make informed decision on my indexes. Sometimes the indexes SQL Server wants does not make sense to me. Does anyone know how I can figure out what wants it?

you could try something like this query, which lists the QueryText:
;WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
, CachedPlans AS
(SELECT
RelOp.op.value(N'../../#NodeId', N'int') AS ParentOperationID
,RelOp.op.value(N'#NodeId', N'int') AS OperationID
,RelOp.op.value(N'#PhysicalOp', N'varchar(50)') AS PhysicalOperator
,RelOp.op.value(N'#LogicalOp', N'varchar(50)') AS LogicalOperator
,RelOp.op.value(N'#EstimatedTotalSubtreeCost ', N'float') AS EstimatedCost
,RelOp.op.value(N'#EstimateIO', N'float') AS EstimatedIO
,RelOp.op.value(N'#EstimateCPU', N'float') AS EstimatedCPU
,RelOp.op.value(N'#EstimateRows', N'float') AS EstimatedRows
,cp.plan_handle AS PlanHandle
,qp.query_plan AS QueryPlan
,st.TEXT AS QueryText
,cp.cacheobjtype AS CacheObjectType
,cp.objtype AS ObjectType
,cp.usecounts AS UseCounts
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
CROSS APPLY qp.query_plan.nodes(N'//RelOp') RelOp (op)
)
SELECT
PlanHandle
,ParentOperationID
,OperationID
,PhysicalOperator
,LogicalOperator
,UseCounts
,CacheObjectType
,ObjectType
,EstimatedCost
,EstimatedIO
,EstimatedCPU
,EstimatedRows
,QueryText
FROM CachedPlans
WHERE CacheObjectType = N'Compiled Plan'
AND PhysicalOperator IN ('nothing will ever match this one!'
--,'Assert'
--,'Bitmap'
--,'Clustered Index Delete'
--,'Clustered Index Insert'
,'Clustered Index Scan'
--,'Clustered Index Seek'
--,'Clustered Index Update'
--,'Compute Scalar'
--,'Concatenation'
--,'Constant Scan'
,'Deleted Scan'
--,'Filter'
--,'Hash Match'
,'Index Scan'
--,'Index Seek'
--,'Index Spool'
,'Inserted Scan'
--,'Merge Join'
--,'Nested Loops'
--,'Parallelism'
,'Parameter Table Scan'
--,'RID Lookup'
--,'Segment'
--,'Sequence Project'
--,'Sort'
--,'Stream Aggregate'
--,'Table Delete'
--,'Table Insert'
,'Table Scan'
--,'Table Spool'
--,'Table Update'
--,'Table-valued function'
--,'Top'
)
just add an ORDER BY on something like the combination of the UseCounts and EstimatedCost.

Here is what finally worked:
with xmlnamespaces(default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan') , CachedPlans as (
select
query_plan,
n.value('../../../#StatementText' ,'varchar(1000)') as [Statement],
n.value('../../../#StatementSubTreeCost' ,'varchar(1000)') as [Cost],
n.value('../../../#StatementEstRows' ,'varchar(1000)') as [Rows],
n.value('#Impact' ,'float') as Impact,
n.value('MissingIndex[1]/#Database' ,'varchar(128)') as [Database],
n.value('MissingIndex[1]/#Table' ,'varchar(128)') as [TableName],
(
select dbo.concat(c.value('#Name' ,'varchar(128)'))
from n.nodes('MissingIndex/ColumnGroup[#Usage="EQUALITY"][1]') as t(cg)
cross apply cg.nodes('Column') as r(c)
) as equality_columns,
(
select dbo.concat(c.value('#Name' ,'varchar(128)'))
from n.nodes('MissingIndex/ColumnGroup[#Usage="INEQUALITY"][1]') as t(cg)
cross apply cg.nodes('Column') as r(c)
) as inequality_columns,
(
select dbo.concat(c.value('#Name' ,'varchar(128)'))
from n.nodes('MissingIndex/ColumnGroup[#Usage="INCLUDE"][1]') as t(cg)
cross apply cg.nodes('Column') as r(c)
) as include_columns
from (
select query_plan
from sys.dm_exec_cached_plans p
outer apply sys.dm_exec_query_plan(p.plan_handle) tp
) as tab(query_plan)
cross apply query_plan.nodes('//MissingIndexGroup') as q(n)
)
select *
from CachedPlans

You could run a profiler trace and check out the procedures that are running and their effectiveness in terms on index seeks / usage.
Rather than just do all indices for everyone, it is better to optimize the biggest problem - you usually will get the most benefit from this.
In the profiler trace, figure out which stored proc / tsql statement runs the most number of times and consumes the most resources. Those are the ones that you really want to go after.

Related

Who (Or What) ran this query ID

I'd like to know who (Service account, user account ,etc ) ran each query_id that Query_Store records. Is there a way to do this? I've looked all over and can't seem to find anything.
Basically I'd like the data from this to be stored as well. Hostname, and Login Name are very useful to me.
SELECT sdest.DatabaseName
,sdes.session_id
,sdes.[host_name]
,sdes.[program_name]
,sdes.client_interface_name
,sdes.login_name
,sdes.login_time
,sdes.nt_domain
,sdes.nt_user_name
,sdec.client_net_address
,sdec.local_net_address
,sdest.ObjName
,sdest.Query
FROM sys.dm_exec_sessions AS sdes
INNER JOIN sys.dm_exec_connections AS sdec ON sdec.session_id = sdes.session_id
CROSS APPLY (
SELECT db_name(dbid) AS DatabaseName
,object_id(objectid) AS ObjName
,ISNULL((
SELECT TEXT AS [processing-instruction(definition)]
FROM sys.dm_exec_sql_text(sdec.most_recent_sql_handle)
FOR XML PATH('')
,TYPE
), '') AS Query
FROM sys.dm_exec_sql_text(sdec.most_recent_sql_handle)
) sdest
where sdes.session_id <> ##SPID
--and sdes.nt_user_name = '' -- Put the username here !
ORDER BY sdec.session_id
Credit: Execution datetime for SQL queries against SQL Server
There's no DMV that records which sessions ran which queries. To gather that information you must use an Extended Events trace, or a Database Audit.

Using T-SQL find the Stored Procedure associated to an SSRS Report

I had some help writing this query -- I'm at a bit of a loss because i'm trying to find the the query type or procedure used and i'm not sure what else to add to the query or how to change it.
SELECT
Ds.Name as Data_Source_Name,
C2.Name AS Data_Source_Reference_Name,
C.Name AS Dependent_Item_Name,
C.Path AS Dependent_Item_Path,
ds.*
FROM
ReportServer.dbo.DataSource AS DS
INNER JOIN
ReportServer.dbo.Catalog AS C ON DS.ItemID = C.ItemID
AND DS.Link IN (SELECT ItemID
FROM ReportServer.dbo.Catalog
WHERE Type = 5) --Type 5 identifies data sources
FULL OUTER JOIN
ReportServer.dbo.Catalog C2 ON DS.Link = C2.ItemID
WHERE
C2.Type = 5
AND c.name LIKE '%mkt%'
ORDER BY
C.Path, C2.Name ASC, C.Name ASC;
Please advise.
Based on my comment, give this a try, should get you moving in the right direction on how you can parse the xml and zero in on the specific command.
You might have to update the name spaces in the script below and also add your report name.
But try something like this:
;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition') --You may have to change this based on you SSRS version
SELECT
[Path],
Name,
report_xml.value( '(/Report/DataSources/DataSource/#Name)[1]', 'VARCHAR(50)' ) AS DataSource,
report_xml.value( '(/Report/DataSets/DataSet/Query/CommandText/text())[1]', 'VARCHAR(MAX)' ) AS CommandText,
report_xml.value( '(/Report/DataSets/DataSet/Query/CommandType/text())[1]', 'VARCHAR(100)' ) AS CommandType,
report_xml
FROM
(
SELECT
[Path],
Name,
[Type],
CAST( CAST( content AS VARBINARY(MAX) ) AS XML ) report_xml
FROM dbo.[Catalog]
WHERE Content IS NOT NULL
AND [Type] = 2
) x
WHERE
--use below in where clause if searching for the CommandText. Depending on how the report was developed I would just use the proc name and no brackets or schema.
--Example: if you report was developed as having [dbo].[procName] just use LIKE '%procName%' below. Because other reports could just have dbo.procName.
report_xml.value( '(/Report/DataSets/DataSet/Query/CommandText/text())[1]', 'VARCHAR(MAX)' ) LIKE '%Your Proc Name here%'
--comment out the above and uncomment below if know your report name and want to search for that specific report.
--[x].[Name] = 'The Name Of Your Report'
You're in the right neighborhood... When a report RDL is published its XML is converted into a image data type and stored in dbo.Catalog.Content.
If you convert the image data to VARBINARY(MAX) and then convert to XML, you'll be able to read the XML in plain text.
SELECT TOP (10)
*
FROM
dbo.Catalog c
CROSS APPLY ( VALUES (CONVERT(XML, CONVERT(VARBINARY(MAX), c.Content))) ) cx (content_xml)
WHERE
c.Type = 2;
From there it's just a matter of parsing the XML to dig out what you're looking for. In this case you looking for tags that look like the following...
<DataSet Name="My_stored_proc">
Are you looking for the stored procedure name? If your looking to see where that is at its in the database itself, database > DatabaseName > Programmability > Stored Procedures. If your trying to use the query you built for a report you need to make the stored procedure or change the query type to text and paste it in the box.

Possible to detect cursors (nested cursors) within T-SQL code?

I'm hoping to find a way to sniff out potentially inefficient T-SQL within stored procedures, in this case detecting not just cursors in stored procedures, but preferably nested cursors.
With the script below based on sys.dm_sql_referenced_entities, from a given starting stored procedure I can see a recursive downstream call stack, including a column indicating whether the text CURSOR was found within the procedure definition.
This is helpful, but it isn't capable of telling me:
whether more than one cursor exists within a procedure
whether nested cursors are used (and of course, the truly perfect solution for that would also have to detect a call to stored procedure containing a cursor, from within a cursor)
Being able to do this I think is probably beyond the abilities of querying sys tables, and involves parsing the SQL itself - does anyone know of a technique or tool that could accomplish this, or perhaps an entirely different approach that could tell me the same information.
DECLARE #procname varchar(30)
SET #procname='dbo.some_root_procedure_name'
;WITH CTE([DB],[OBJ],[INDENTED_OBJ],[SCH],[lvl],[indexof_cursor],[referenced_object_definition])
AS
(
SELECT referenced_database_name AS [DB],referenced_entity_name AS [OBJ],
cast(space(0) + referenced_entity_name as varchar(max)) AS [INDENTED_OBJ],
referenced_schema_name AS [SCH],0 AS [lvl]
,charindex('cursor',object_definition(referenced_id)) as indexof_cursor
,object_definition(referenced_id) as [referenced_object_definition]
FROM sys.dm_sql_referenced_entities(#procname, 'OBJECT')
INNER JOIN sys.objects as o on o.object_id=OBJECT_ID(referenced_entity_name)
WHERE o.type IN ('P','FN','IF','TF')
UNION ALL
SELECT referenced_database_name AS [DB],referenced_entity_name AS [OBJ],
cast(space(([lvl]+1)*2) + referenced_entity_name as varchar(max)) AS [INDENTED_OBJ],
referenced_schema_name AS [SCH],[lvl]+1 as [lvl]
,charindex('cursor',object_definition(referenced_id)) as indexof_cursor
,object_definition(referenced_id) as [referenced_object_definition]
FROM CTE as c CROSS APPLY
sys.dm_sql_referenced_entities(c.SCH+'.'+c.OBJ, 'OBJECT') as ref
INNER JOIN sys.objects as o on o.object_id=OBJECT_ID(referenced_entity_name)
WHERE o.type IN ('P','FN','IF','TF') and ref.referenced_entity_name NOT IN (c.OBJ) -- Exit Condition
)
SELECT
*
FROM CTE
EDIT: I am marking this as "solved" even though I think some improvements could be made to the below solution - I think it is "good enough" for most scenarios, but I think a fully recursive solution that can traverse an "infinitely" deep call chain is possible.
Maybe there is a more efficient way, but you could search the procedure code. It's not foolproof though in that it could get some false positives, but you shouldn't miss any. It doesn't ignore comments and variable names so it's quite possible to pick up some extra stuff.
SELECT name, xtype, colid, text
into #CodeBlocks
FROM dbo.sysobjects left join .dbo.syscomments
ON dbo.sysobjects.id = .dbo.syscomments.id
where xtype = 'P'
order by 1
SELECT name,
(SELECT convert(varchar(max),text)
FROM #CodeBlocks t2
WHERE t1.name = t2.name
ORDER BY t2.colid
FOR XML PATH('')
) text
into #AllCode
FROM #CodeBlocks t1
GROUP BY name
select #AllCode.name,
case when InterProc.name is not null then
'Possible Inter-Proc Nesting'
when #AllCode.text like '%CURSOR%FOR%CURSOR%FOR%DEALLOCATE%DEALLOCATE%' then
'Possible Nested Cursor'
when #AllCode.text like '%CURSOR%FOR%CURSOR%FOR%' then
'Possible Multiple Cursor Used'
ELSE
'Possible Cursor Used'
end
from #AllCode
left join #AllCode InterProc
on InterProc.text like '%CURSOR%FOR%'
and #AllCode.text like '%CURSOR%FOR%' + InterProc.name + '%DEALLOCATE%'
where #AllCode.text like '%CURSOR%FOR%'
I found a few nested cursors on our server I didn't know about. Interesting. :)

How to check if a sql variable is declared?

In MS SQL Server. Please don't tell me to ctrl-f. I don't have access to the entire query, but I'm composing the columns that depend on whether certain variable is declared.
Thanks.
Edit:
I'm working with some weird query engine. I need to write the select columns part and the engine will take care of the rest (hopefully). But in some cases this engine will declare variables (thankfully I will know the variable names), and in other cases it doesn't. I need to compose my columns to take these variables when they are declared, and give default values when these variables are not declared.
Given the limitations of my understanding of what you're running (I'm deciphering the word problem to mean that your "query engine" is actually a "query generation engine" so, something like an ORM) you could observe what's occurring on the server in this scenario with the following query:
select
sql_handle,
st.text
from sys.dm_exec_requests r
cross apply sys.dm_exec_sql_text(r.sql_handle) st
where session_id <> ##SPID
and st.text like '%#<<parameter_name>>%';
The statement needs to have begun execution to be able to catch it. Depending on a multitude of situations, you may be able to pull it from query stats, too:
This will also get you the query plan (if it has one), but note that it will also pull stats for itself as well as the query above so you'll need to be discerning when you look at the outer and statement text values:
select
text,
SUBSTRING(
st.text,
(qs.statement_start_offset / 2) + 1,
((CASE qs.statement_end_offset
WHEN -1 THEN DATALENGTH(st.TEXT)
ELSE qs.statement_end_offset
END -
qs.statement_start_offset) / 2) + 1)
AS statement_text,
plan_generation_num, creation_time, last_execution_time, execution_count
,total_worker_time, last_worker_time, min_worker_time, max_worker_time,
total_physical_reads, min_physical_reads, max_physical_reads, last_physical_reads,
total_logical_writes, min_logical_writes, max_logical_writes, last_logical_writes,
total_logical_reads, min_logical_reads, max_logical_reads, last_logical_reads,
total_elapsed_time, last_elapsed_time, min_elapsed_time, max_elapsed_time,
total_rows,last_rows,min_rows,max_rows
,qp.*
from sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) st
outer apply sys.dm_exec_query_plan(qs.plan_handle) qp
where st.text like '%#<<parameter_name>>%';

How can I query how much time a SQL server database restore takes?

Im trying to write a query that will tell me how much time a restore (full or log) has taken on SQL server 2008.
I can run this query to find out how much time the backup took:
select database_name,
[uncompressed_size] = backup_size/1024/1024,
[compressed_size] = compressed_backup_size/1024/1024,
backup_start_date,
backup_finish_date,
datediff(s,backup_start_date,backup_finish_date) as [TimeTaken(s)],
from msdb..backupset b
where type = 'L' -- for log backups
order by b.backup_start_date desc
This query will tell me what is restored but now how much time it took:
select * from msdb..restorehistory
restorehistory has a column backup_set_id which will link to msdb..backupset, but that hold the start and end date for the backup not the restore.
Any idea where to query the start and end time for restores?
To find the RESTORE DATABASE time, I have found that you can use this query:
declare #filepath nvarchar(1000)
SELECT #filepath = cast(value as nvarchar(1000)) FROM [fn_trace_getinfo](NULL)
WHERE [property] = 2 and traceid=1
SELECT *
FROM [fn_trace_gettable](#filepath, DEFAULT)
WHERE TextData LIKE 'RESTORE DATABASE%'
ORDER BY StartTime DESC;
The downside is, you'll notice that, at least on my test server, the EndTime is always NULL.
So, I came up with a second query to try and determine the end time. First of all, I apologize that this is pretty ugly and nested like crazy.
The query below assumes the following:
When a restore is run, for that DatabaseID and ClientProcessID, the next EventSequence contains the TransactionID we need.
I then go and find the max EventSequence for the Transaction
Finally, I select the record that contains RESTORE DATABASE and the maximum transaction associated with that record.
I'm sure someone can probably take what I've done and refine it, but this appears to work on my test environment:
declare #filepath nvarchar(1000)
SELECT #filepath = cast(value as nvarchar(1000)) FROM [fn_trace_getinfo](NULL)
WHERE [property] = 2 and traceid=1
SELECT *
FROM [fn_trace_gettable](#filepath, DEFAULT) F5
INNER JOIN
(
SELECT F4.EventSequence MainSequence,
MAX(F3.EventSequence) MaxEventSequence, F3.TransactionID
FROM [fn_trace_gettable](#filepath, DEFAULT) F3
INNER JOIN
(
SELECT F2.EventSequence, MIN(TransactionID) as TransactionID
FROM [fn_trace_gettable](#filepath, DEFAULT) F1
INNER JOIN
(
SELECT DatabaseID, SPID, StartTime, ClientProcessID, EventSequence
FROM [fn_trace_gettable](#filepath, DEFAULT)
WHERE TextData LIKE 'RESTORE DATABASE%'
) F2 ON F1.DatabaseID = F2.DatabaseID AND F1.SPID = F2.SPID
AND F1.ClientProcessID = F2.ClientProcessID
AND F1.StartTime > F2.StartTime
GROUP BY F2.EventSequence
) F4 ON F3.TransactionID = F4.TransactionID
GROUP BY F3.TransactionID, F4.EventSequence
) F6 ON F5.EventSequence = F6.MainSequence
OR F5.EventSequence = F6.MaxEventSequence
ORDER BY F5.StartTime
EDIT
I made some changes to the query, since one of the test databases I used is case-sensitive and it was losing some records. I also noticed when restoring from disk that the DatabaseID is null, so I'm handling that now as well:
SELECT *
FROM [fn_trace_gettable](#filepath, DEFAULT) F5
INNER JOIN
(
SELECT F4.EventSequence MainSequence,
MAX(F3.EventSequence) MaxEventSequence, F3.TransactionID
FROM [fn_trace_gettable](#filepath, DEFAULT) F3
INNER JOIN
(
SELECT F2.EventSequence, MIN(TransactionID) as TransactionID
FROM [fn_trace_gettable](#filepath, DEFAULT) F1
INNER JOIN
(
SELECT DatabaseID, SPID, StartTime, ClientProcessID, EventSequence
FROM [fn_trace_gettable](#filepath, DEFAULT)
WHERE upper(convert(nvarchar(max), TextData))
LIKE 'RESTORE DATABASE%'
) F2 ON (F1.DatabaseID = F2.DatabaseID OR F2.DatabaseID IS NULL)
AND F1.SPID = F2.SPID
AND F1.ClientProcessID = F2.ClientProcessID
AND F1.StartTime > F2.StartTime
GROUP BY F2.EventSequence
) F4 ON F3.TransactionID = F4.TransactionID
GROUP BY F3.TransactionID, F4.EventSequence
) F6 ON F5.EventSequence = F6.MainSequence
OR F5.EventSequence = F6.MaxEventSequence
ORDER BY F5.StartTime
Make it a Job. Then run it as the Job. Then check the View Job History. Then look at the duration column.
While it is running you can check something like this dmv.
select
d.name
,percent_complete
,dateadd(second,estimated_completion_time/1000, getdate())
, Getdate() as now
,datediff(minute, start_time
, getdate()) as running
, estimated_completion_time/1000/60 as togo
,start_time
, command
from sys.dm_exec_requests req
inner join sys.sysdatabases d on d.dbid = req.database_id
where
req.command LIKE '%RESTORE%'
Or you can use some magic voodoo and interpret the transaction log in the following table function, however the only person I know to understand any info in this log is Paul Randal.
I Know he sometimes checks Server Fault, but don't know if he wonders StackOverflow.
select * from fn_dblog(NULL,NULL)
Hope this helps.
If you manage to use this and find a solution please tell us.
Good Luck!

Resources