How to check blocking queries in SQL Server

I have a warehouse server that receives data/sync from a legacy system 24/7. I've noticed that the performance of some of my reports/SQL jobs is inconsistent, and most of the time the DBA team tells me my query is blocking other sync processes.
From the DBA team I learned about the command EXEC sp_who2, which lets me identify the SPID of the query causing the blocking by looking at the BlkBy column.
Please suggest how I can avoid blocking, and other ways to check for blocking in SQL Server.

Apart from sp_who2, you can use the following query to identify blocking on your SQL Server instance.
SELECT  db.name AS DBName,
        tl.request_session_id,
        wt.blocking_session_id,
        OBJECT_NAME(p.object_id) AS BlockedObjectName,
        tl.resource_type,
        h1.text AS RequestingText,
        h2.text AS BlockingText,
        tl.request_mode
FROM sys.dm_tran_locks AS tl
INNER JOIN sys.databases db ON db.database_id = tl.resource_database_id
INNER JOIN sys.dm_os_waiting_tasks AS wt ON tl.lock_owner_address = wt.resource_address
INNER JOIN sys.partitions AS p ON p.hobt_id = tl.resource_associated_entity_id
INNER JOIN sys.dm_exec_connections ec1 ON ec1.session_id = tl.request_session_id
INNER JOIN sys.dm_exec_connections ec2 ON ec2.session_id = wt.blocking_session_id
CROSS APPLY sys.dm_exec_sql_text(ec1.most_recent_sql_handle) AS h1
CROSS APPLY sys.dm_exec_sql_text(ec2.most_recent_sql_handle) AS h2
GO
You can also check the details of a particular SPID with the following commands.
DBCC INPUTBUFFER(56) -- shows the last statement sent by session 56
KILL 56 -- kills that session

This is a very comprehensive guide. Some basic guidelines, though:
Avoid the SELECT ... INTO #temp pattern; instead, create the temp table first and then use INSERT INTO #Temp SELECT ...
Use WITH (NOLOCK) on queries where you can tolerate dirty reads
Ensure proper indexes exist
Use sargable predicates in your WHERE clauses
Talk to your DBA about potentially enabling the READ_COMMITTED_SNAPSHOT isolation level (see the sketch below)
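If you go the row-versioning route, the change is a single database-level setting; here is a minimal sketch, assuming placeholder names (YourWarehouseDb, dbo.Orders), and note that WITH ROLLBACK IMMEDIATE kicks out in-flight transactions, so coordinate the timing with your DBA:
-- Readers use row versions instead of waiting on writers' locks (database name is a placeholder)
ALTER DATABASE YourWarehouseDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
-- A dirty-read hint on a single query, only where stale/uncommitted rows are acceptable (table name is a placeholder)
SELECT OrderId, OrderStatus
FROM dbo.Orders WITH (NOLOCK)
WHERE OrderStatus = 'Pending';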

The simplest method is to use Activity Monitor within Microsoft's SQL Server Management Studio (SSMS). Open Activity Monitor from SSMS and look at the 'Processes' pane; you can filter it down to blocked and blocking sessions using the drop-downs at the top of each column. This shows all currently running processes and their associated session IDs, as well as any transactions they are involved in, including those being blocked by other sessions.
You can also check for blocking using a few T-SQL scripts designed explicitly to check locking behavior on live systems. One such script is sp_who2: this simple system stored procedure displays lock and activity information for active user connections and their process IDs across all databases on a SQL Server instance.
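If sp_who2's output is too wide to scan by eye, a common trick is to capture it into a temp table and filter on the BlkBy column. A rough sketch follows; the column list is assumed from sp_who2's standard output, so adjust it if your build returns something different:
CREATE TABLE #who2
(
    SPID INT, [Status] VARCHAR(50), [Login] VARCHAR(256), HostName VARCHAR(256),
    BlkBy VARCHAR(10), DBName VARCHAR(256), Command VARCHAR(256),
    CPUTime BIGINT, DiskIO BIGINT, LastBatch VARCHAR(50),
    ProgramName VARCHAR(256), SPID2 INT, RequestId INT
);
INSERT INTO #who2 EXEC sp_who2;
-- Only sessions that are actually blocked by someone
SELECT SPID, BlkBy, DBName, Command, CPUTime, LastBatch
FROM #who2
WHERE LTRIM(RTRIM(BlkBy)) NOT IN ('', '.');
DROP TABLE #who2;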

Related

Golang SQL: More Efficient to Query Two Tables at Once or Separate Queries/Connection Pools?

I have a connection pool for database A and database B. I am moving some Node.js code over to Go (I'm using SQL Server, if that matters), and some of the queries are doing this:
db.A.Query(`
select ... from some_table;
select ... from B..other_table;
`)
Is it better to do it that way, or like:
db.A.Query(...)
db.B.Query(...)
I read this line:
create one sql.DB object for each distinct datastore you need to access
from here. And only now do I realize I read 'datastore' as 'database', so now I'm not even sure if it's efficient to have these two database connection pools!
Thank you for any help!
For most scenarios and SQL Server client programs, sending multiple SELECT queries in a single batch is not materially more efficient. Perhaps if the queries returned very small result sets and you ran them at very high frequency, you could see a material difference. But in the typical case, whether you send the queries in one batch or two won't matter much.
It won't matter to SQL Server at all, so the only difference will be in the client/server network traffic.
SSMS will let you compare the client statistics between running the queries as a one-batch script and as a multi-batch script. E.g., running
select top 10 * from sys.objects
select top 5 * from sys.columns
and then
select top 10 * from sys.objects
GO
select top 5 * from sys.columns
in SSMS outputs the following client statistics

Detect the cause of SQL Server update lock

Problem:
A .NET application, during a business transaction, executes a query like
UPDATE Order
SET Description = 'some new description'
WHERE OrderId = @p1 AND RowVersion = @p2
This query hangs until timeout (several minutes) and then I get an exception:
SqlException: Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
It is reproduced when the database is under heavy load (several times per day).
I need to detect the cause of the lock on the query.
What I've tried:
Exploring Activity Monitor - it shows that the query is hanging on a lock. Filtering by head blocker does not give much; it changes frequently.
Analyzing a SQL script that gives data similar to Activity Monitor - almost the same result as looking at Activity Monitor. Chasing blocking_session_id leads to some session that is awaiting a command or executing some SQL with no obvious relation to the Order table. Executing the same script a second later gives a different session. I also tried some other queries/stored procedures from this article, with no result.
Building the standard SQL Server report for locked/problem transactions results in errors like "Max recursion exhausted" or a local OutOfMemory exception (I have 16 GB RAM).
Database Details
Version: SQL Server 2016
Approximate number of concurrent queries per second by apps to database: 400
Database size: 1.5 Tb
Transaction isolation level: ReadUncommitted for read-only transactions, Serializable for transactions with modifications
I'm absolutely new to this kind of problem, so I have missed a lot for sure.
Any help or direction would be great!
In case anyone is interested, I found this particular query especially useful:
SELECT tl.resource_type
    ,OBJECT_NAME(p.object_id) AS object_name
    ,tl.request_status
    ,tl.request_mode
    ,tl.request_session_id
    ,tl.resource_description
    ,(SELECT text FROM sys.dm_exec_sql_text(r.sql_handle)) AS request_text
FROM sys.dm_tran_locks tl
INNER JOIN sys.dm_exec_requests r ON tl.request_session_id = r.session_id
LEFT JOIN sys.partitions p ON p.hobt_id = tl.resource_associated_entity_id
WHERE tl.resource_database_id = DB_ID()
    AND OBJECT_NAME(p.object_id) = '<YourTableName>'
ORDER BY tl.request_session_id
It shows the transactions that have acquired locks on <YourTableName> and the query each one is currently executing.
You can also use the sys.dm_exec_requests view and filter on the blocking_session_id and wait_time columns.
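For example, a quick sketch of that approach, showing only the requests that are currently waiting on another session:
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time AS wait_time_ms,
       r.command,
       DB_NAME(r.database_id) AS database_name,
       t.text AS current_statement
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0
ORDER BY r.wait_time DESC;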

SQLAlchemy: Multiple databases (on the same server) in a single session?

I'm running MS SQL Server and am trying to perform a JOIN between two tables located in different databases (on the same server). If I connect to the server using pyodbc (without specifying a database), then the following raw SQL works fine.
SELECT * FROM DatabaseA.dbo.tableA tblA
INNER JOIN DatabaseB.dbo.tableB tblB
ON tblA.id = tblB.id
Unfortunately, I just can't seem to get the analog to work using SQLAlchemy. I've seen this topic touched on in a few places:
Is there a way to perform a join across multiple sessions in sqlalchemy?
Cross database join in sqlalchemy
How do I connect to multiple databases on the same SQL Server with sqlalchemy?
How can I use multiple databases in the same request in Cherrypy and SQLAlchemy?
Most recommend using different engines/sessions, but I crucially need to perform joins between the databases, so I don't think that approach will help. Another typical suggestion is to use the schema parameter, but that does not seem to work for me. For example, the following does not work:
engine = create_engine('mssql+pyodbc://...') #Does not specify database
metadataA = MetaData(bind=engine, schema='DatabaseA.dbo', reflect=True)
tableA = Table('tableA', metadataA, autoload=True)
metadataB = MetaData(bind=engine, schema='DatabaseB.dbo', reflect=True)
tableB = Table('tableB', metadataB, autoload=True)
I've also tried variants where schema='DatabaseA' and schema='dbo'. In all cases, SQLAlchemy throws a NoSuchTableError for both tables A and B. Any ideas?
If you can create a synonym in one of the databases, you can keep your query local to that single database.
USE DatabaseB;
GO
CREATE SYNONYM dbo.DbA_TblA FOR DatabaseA.dbo.tableA;
GO
Your query then becomes:
SELECT * FROM dbo.DbA_TblA tblA
INNER JOIN dbo.tableB tblB
ON tblA.id = tblB.id
I'm able to run a test just like this here, reflecting from two remote databases, and it works fine.
Are you using a recent SQLAlchemy (at least 0.8.3 recommended)?
Turn on echo='debug' - what tables is it finding?
After the reflect, what's present in metadataA.tables and metadataB.tables?
Is the casing here exactly what's on the SQL Server side (e.g. tableA)? Using a case-sensitive name like that will cause it to be quoted.
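As a cross-check on the SQL Server side, you can ask the server directly which table names (and what exact casing) it has in each database; a small sketch using the database names from the question:
SELECT TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME
FROM DatabaseA.INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME LIKE 'table%';
SELECT TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME
FROM DatabaseB.INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME LIKE 'table%';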

SQL Server COMPILE locks?

SQL Server 2000 here.
I'm trying to be an interim DBA, but don't know much about the mechanics of a database server, so I'm getting a little stuck. There's a client process that hits three views simultaneously. These three views query a remote server to pull back data.
What it looks like is that one of these queries will work, but the other two fail (the client process says they time out, so I'm guessing a lock can do that). The querying process holds a lock that sticks around until the SQL Server process is restarted (I got gutsy and tried to kill the SPID once, but it wouldn't let go). Any query against this database after that hangs and blames the first process for blocking it.
The process reports these locks... (apologies for the formatting, the preview functionality shows it as fully lined up).
spid  dbid  ObjId       IndId  Type  Resource   Mode   Status
53    17    0           0      DB               S      GRANT
53    17    1445580188  0      TAB              Sch-S  GRANT
53    17    1445580188  0      TAB   [COMPILE]  X      GRANT
I can't analyze that too well. Object 1445580188 is sp_bindefault, a system stored procedure in master. What's it hanging on to an exclusive lock for?
Here is the view code; to protect the proprietary details I only changed the names (they stay consistent with aliases and whatnot) and tried to keep everything else exactly the same.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER OFF
GO
ALTER view [dbo].[theView]
as
select
a.[column1] column_1
,b.[column2] column_2
,[column3]
,[column4]
,[column5]
,[column6]
,[column7]
,[column8]
,[column9]
,[column10]
,p.[column11]
,p.[column12]
FROM
[remoteServer].db1.dbo.[tableP] p
join [remoteServer].db2.dbo.tableA a on p.id2 = a.id
join [remoteServer].db2.dbo.tableB b on p.id = b.p_id
WHERE
isnumeric(b.code) = 1
GO
SET ANSI_NULLS OFF
GO
SET QUOTED_IDENTIFIER ON
GO
Take a look at this link. Are you sure it's views that are blocking and not stored procedures? To find out, run the query below with the ObjId from your output above. There are things you can do to mitigate stored procedure recompiles. The biggest one is to avoid naming your stored procedures with an "sp_" prefix (see this article, page 10). Also avoid IF/ELSE branches in the code; use WHERE clauses with CASE expressions instead. I hope this helps.
[Edit]:
I believe sp_bindefault/sp_bindrule is used in conjunction with user-defined types (UDTs). Does your view reference any UDTs?
SELECT * FROM sys.Objects where object_id = 1445580188
Object 1445580188 is sp_bindefault in the master database, no? Also, it shows resource = "TAB" = table.
USE master
SELECT OBJECT_NAME(1445580188), OBJECT_ID('sp_bindefault')
USE mydb
SELECT OBJECT_NAME(1445580188)
If the 2nd query returns NULL, then the object is a work table.
I'm guessing it's a work table being generated to deal with the results locally.
The JOIN will happen locally and all data must be pulled across.
Now, I can't shed light on the compile lock: the view should be compiled already. This is complicated by the remote server access and my experience of compile locks is all related to stored procs.

CPU utilization by database?

Is it possible to get a breakdown of CPU utilization by database?
I'm ideally looking for a Task Manager type interface for SQL server, but instead of looking at the CPU utilization of each PID (like taskmgr) or each SPID (like spwho2k5), I want to view the total CPU utilization of each database. Assume a single SQL instance.
I realize that tools could be written to collect this data and report on it, but I'm wondering if there is any tool that lets me see a live view of which databases are contributing most to the sqlservr.exe CPU load.
Sort of. Check this query out:
SELECT total_worker_time/execution_count AS AvgCPU
, total_worker_time AS TotalCPU
, total_elapsed_time/execution_count AS AvgDuration
, total_elapsed_time AS TotalDuration
, (total_logical_reads+total_physical_reads)/execution_count AS AvgReads
, (total_logical_reads+total_physical_reads) AS TotalReads
, execution_count
, SUBSTRING(st.TEXT, (qs.statement_start_offset/2)+1
, ((CASE qs.statement_end_offset WHEN -1 THEN datalength(st.TEXT)
ELSE qs.statement_end_offset
END - qs.statement_start_offset)/2) + 1) AS txt
, query_plan
FROM sys.dm_exec_query_stats AS qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) AS st
cross apply sys.dm_exec_query_plan (qs.plan_handle) AS qp
ORDER BY 1 DESC
This will get you the queries in the plan cache in order of how much CPU they've used up. You can run this periodically, like in a SQL Agent job, and insert the results into a table to make sure the data persists beyond reboots.
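As an example, here is a rough sketch of that persistence step; the snapshot table name and the trimmed-down column list are mine, not part of the query above:
-- One-time setup: a table to hold periodic snapshots of plan-cache CPU stats
CREATE TABLE dbo.QueryCpuSnapshot
(
    captured_at   DATETIME2 NOT NULL DEFAULT SYSDATETIME(),
    total_cpu     BIGINT NOT NULL,
    execution_cnt BIGINT NOT NULL,
    query_text    NVARCHAR(MAX) NULL
);
-- Scheduled step (e.g. a SQL Agent job): snapshot the current top CPU consumers
INSERT INTO dbo.QueryCpuSnapshot (total_cpu, execution_cnt, query_text)
SELECT TOP (50)
       qs.total_worker_time,
       qs.execution_count,
       st.text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;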
When you read the results, you'll probably realize why we can't correlate that data directly back to an individual database. First, a single query can also hide its true database parent by doing tricks like this:
USE msdb
DECLARE @StringToExecute VARCHAR(1000)
SET @StringToExecute = 'SELECT * FROM AdventureWorks.dbo.ErrorLog'
EXEC(@StringToExecute)
The query would be executed in msdb, but it would pull its results from AdventureWorks. Where should we assign the CPU consumption?
It gets worse when you:
Join between multiple databases
Run a transaction in multiple databases, and the locking effort spans multiple databases
Run SQL Agent jobs in MSDB that "work" in MSDB, but back up individual databases
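To make the first bullet concrete, a single statement like the one below spans two databases in one go; which one should be charged for the CPU? (The database and table names here are made up.)
SELECT c.CustomerId, i.InvoiceTotal
FROM SalesDb.dbo.Customers AS c
JOIN FinanceDb.dbo.Invoices AS i ON i.CustomerId = c.CustomerId;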
It goes on and on. That's why it makes sense to performance tune at the query level instead of the database level.
In SQL Server 2008 R2, Microsoft introduced performance management and application management features that let us package a single database into a distributable, deployable DAC pack, and they're promising features to make it easier to manage the performance of individual databases and their applications. It still doesn't do what you're looking for, though.
For more of those, check out the T-SQL repository at Toad World's SQL Server wiki (formerly at SQLServerPedia).
Updated on 1/29 to include total numbers instead of just averages.
SQL Server (starting with 2000) will install performance counters (viewable from Performance Monitor or Perfmon).
One of the counter categories (from a SQL Server 2005 install) is:
- SQLServer:Databases
with one instance for each database. The available counters, however, do not include a CPU % utilization counter or anything similar, although there are some rate counters you could use to get a reasonable estimate. For example, if you have two databases and the measured rate is 20 transactions/sec on database A and 80 transactions/sec on database B, you would know that A contributes roughly 20% of the total CPU and B the other 80%.
There are some flaws here, since this assumes all the work being done is CPU-bound, which with databases it of course is not. But it would be a start, I believe.
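If you'd rather read those counters from inside SQL Server (2005 and later) than from Perfmon, sys.dm_os_performance_counters exposes the same data. Note that 'Transactions/sec' is a cumulative counter, so you need two samples over an interval to turn it into a rate; a sketch:
SELECT instance_name AS database_name,
       cntr_value AS transactions_cumulative
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%:Databases%'
  AND counter_name = 'Transactions/sec'
  AND instance_name <> '_Total'
ORDER BY cntr_value DESC;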
Here's a query that will show the actual databases causing high load. It relies on the plan cache, which might get flushed frequently in low-memory scenarios (making the query less useful).
select dbs.name, cacheobjtype, total_cpu_time, total_execution_count
from
    (select top 10
            sum(qs.total_worker_time) as total_cpu_time,
            sum(qs.execution_count) as total_execution_count,
            count(*) as number_of_statements,
            qs.plan_handle
     from sys.dm_exec_query_stats qs
     group by qs.plan_handle
     order by sum(qs.total_worker_time) desc) a
inner join
    (select plan_handle, pvt.dbid, cacheobjtype
     from (select plan_handle, epa.attribute, epa.value, cacheobjtype
           from sys.dm_exec_cached_plans
           outer apply sys.dm_exec_plan_attributes(plan_handle) as epa
           /* where cacheobjtype = 'Compiled Plan' and objtype = 'adhoc' */) as ecpa
     pivot (max(ecpa.value) for ecpa.attribute in ([dbid], [sql_handle])) as pvt) b
    on a.plan_handle = b.plan_handle
inner join sys.databases dbs on dbid = dbs.database_id
I think the answer to your question is no.
The issue is that one activity on a machine can cause load on multiple databases. If I have a process that is reading from a config DB, logging to a logging DB, and moving transactions in and out of various DBs based on type, how do I partition the CPU usage?
You could divide CPU utilization by the transaction load, but that is again a rough metric that may mislead you. How would you divide transaction log shipping from one DB to another, for instance? Is the CPU load in the reading or the writing?
You're better off looking at the transaction rate for a machine and the CPU load it causes. You could also profile stored procedures and see if any of them are taking an inordinate amount of time; however, this won't get you the answer you want.
With all of the above in mind:
Starting with SQL Server 2012 (possibly 2008?), there is a database_id column in sys.dm_exec_sessions.
It makes it easy to calculate CPU per database for the currently connected sessions. Once a session disconnects, its numbers are gone.
select session_id, cpu_time, program_name, login_name, database_id
from sys.dm_exec_sessions
where session_id > 50;
select sum(cpu_time)/1000 as cpu_seconds, database_id
from sys.dm_exec_sessions
group by database_id
order by cpu_seconds desc;
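If database names are easier to read than ids, the same aggregation can go through DB_NAME() (a small variant of the query above):
select DB_NAME(database_id) as database_name,
       sum(cpu_time) / 1000 as cpu_seconds
from sys.dm_exec_sessions
where session_id > 50
group by database_id
order by cpu_seconds desc;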
Take a look at SQL Sentry. It does all you need and more.
Regards,
Lieven
Have you looked at SQL profiler?
Take the standard "T-SQL" or "Stored Procedure" template and tweak the fields to group by the database ID (I think you have to use the number; you don't get the database name, but it's easy to find out by running exec sp_databases to get the list).
Run this for a while and you'll get the total CPU counts / disk IO / waits etc. This gives you the proportion of CPU used by each database.
If you monitor the PerfMon counters at the same time (logging the data to a SQL database), and do the same for the SQL Profiler trace (log it to a database too), you may be able to correlate the two.
Even so, it should give you enough of a clue as to which DB is worth looking at in more detail. Then do the same again with just that database ID and look for the most expensive SQL / stored procedures.
Please check this query:
SELECT DB_NAME(st.dbid) AS DatabaseName
    ,OBJECT_SCHEMA_NAME(st.objectid, st.dbid) AS SchemaName
    ,cp.objtype AS ObjectType
    ,OBJECT_NAME(st.objectid, st.dbid) AS Objects
    ,MAX(cp.usecounts) AS Total_Execution_count
    ,SUM(qs.total_worker_time) AS Total_CPU_Time
    ,SUM(qs.total_worker_time) / (MAX(cp.usecounts) * 1.0) AS Avg_CPU_Time
FROM sys.dm_exec_cached_plans cp
INNER JOIN sys.dm_exec_query_stats qs ON cp.plan_handle = qs.plan_handle
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE DB_NAME(st.dbid) IS NOT NULL
GROUP BY DB_NAME(st.dbid), OBJECT_SCHEMA_NAME(st.objectid, st.dbid), cp.objtype, OBJECT_NAME(st.objectid, st.dbid)
ORDER BY SUM(qs.total_worker_time) DESC
