I'm looking for a way to find out whether there were uncommitted transactions in past sessions.
I have already checked V$TRANSACTION for the current sessions, but there is nothing.
I found that the first occurrence of the problem was at 2018-06-29 13:35:07.236, using this query:
SELECT * FROM DBA_HIST_ACTIVE_SESS_HISTORY
where
event ='enq: TX - row lock contention' and
sample_time > ({ts '2018-06-29 12:41:09'})
order by sample_time ASC
Is there a way to find the session ID and the user ID that held the uncommitted transactions?
I know how to detect this for current sessions, but not for past sessions.
If I understand what you are trying to do, there is no need to go looking for uncommitted transactions. DBA_HIST_ACTIVE_SESS_HISTORY has a BLOCKING_SESSION column that will tell you what the session was waiting on at that time.
In the case of a session waiting on enq: TX - row lock contention, the blocking session should be the session that was holding the lock (i.e., the session in the middle of the "uncommitted transaction" you were looking for).
To get the details of sessions with their blockers, do something like this:
SELECT s.*, blk.*
FROM dba_hist_active_sess_history s
-- Add this join to get blocking session
INNER JOIN dba_hist_active_sess_history blk
ON blk.session_id = s.blocking_session
AND blk.dbid = s.dbid
AND blk.snap_id = s.snap_id
AND blk.instance_number = s.instance_number
AND blk.sample_id = s.sample_id
When I use your query, I don't find the root cause:
SELECT s.*, blk.*
FROM dba_hist_active_sess_history s
INNER JOIN dba_hist_active_sess_history blk
ON blk.session_id = s.blocking_session
AND blk.dbid = s.dbid
AND blk.snap_id = s.snap_id
AND blk.instance_number = s.instance_number
AND blk.sample_id = s.sample_id
where
s.event ='enq: TX - row lock contention' and
s.sample_time > ({ts '2018-06-29 12:41:09'})
order by s.sample_time ASC
It shows 5781 (2018-06-29 13:37:10.006) as the blocking_session. Then, with your join, it shows that session 5781 is itself blocked by session 1907 (2018-06-29 13:37:10.006).
Do you know how to find the root blocking session ID and its SQL_ID? I need them to search the Java code for the unclosed transaction that caused this blocking.
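The chain described above (a waiter blocked by 5781, which is in turn blocked by 1907) can be followed mechanically: within one ASH sample, keep following blocking_session until you reach a session that has no blocker of its own. Below is a minimal Python sketch of that idea. It assumes the rows for a single sample_id have already been fetched into a dict mapping session_id to blocking_session (None when not blocked). The function name and the sample data, including waiter session 4242, are hypothetical.

```python
from typing import Dict, List, Optional


def blocker_chain(start: int, blocked_by: Dict[int, Optional[int]]) -> List[int]:
    """Follow blocking_session links from `start` within one ASH sample.

    Returns the sessions visited in order; the last element is the root
    blocker (a session with no recorded blocker in this sample).
    """
    chain = [start]
    seen = {start}
    current = start
    while True:
        nxt = blocked_by.get(current)
        # Stop at a session with no blocker, or one not captured in the sample.
        if nxt is None:
            break
        # Guard against cycles (a true deadlock) so the loop always terminates.
        if nxt in seen:
            break
        chain.append(nxt)
        seen.add(nxt)
        current = nxt
    return chain


# Hypothetical sample: session 4242 waits on 5781, which waits on 1907.
sample = {4242: 5781, 5781: 1907, 1907: None}
chain = blocker_chain(4242, sample)
print(chain, "root blocker:", chain[-1])
```

ASH samples only active sessions, so a root blocker that was idle while holding its lock may have no row of its own. In that case the lookup returns nothing and the chain simply ends at that session's ID, which is still the one to investigate.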
I'm running a SQL Server 2019 Always On Availability Group with asynchronous replication.
I use a free tool called IDERA SQL Check and I have spotted SPID 69, whose program name is Replication Distribution Agent. It's always there, staring at me like a bored cat.
SPID 69 points to a specific database, which is mirrored. I investigated it with this query:
select
s.session_id
,st.text
,login_name
,login_time
,host_name
,program_name
,status
,cpu_time
,memory_usage
,total_scheduled_time
,total_elapsed_time
,last_request_start_time
,reads
,writes
,logical_reads
from sys.dm_exec_sessions s
inner join sys.dm_exec_connections c
on s.session_id = c.session_id
outer apply sys.dm_exec_sql_text(c.most_recent_sql_handle) st
where s.is_user_process = 1
and s.open_transaction_count > 0;
Which gave me this response:
session_id = 69
text = begin tran
login_time = 2020-09-08 18:40:57.153
program_name = Replication Distribution Agent
status = sleeping
cpu_time = 1362772
memory_usage = 4
total_scheduled_time = 1689634
total_elapsed_time = 22354857
last_request_start_time = 2020-09-28 16:28:39.433
reads = 18607577
writes = 5166597
logical_reads = 112256365
Now, what I find on the internet says that seeing the Replication Distribution Agent is fine: the agent should be running and there should be no problem. But why:
The text says begin tran and nothing more?
IDERA SQL Check is labelling it as connection idling transaction?
The status is sleeping?
I'm concerned that CPU time, reads and writes are basically telling me that this process is frying the drive with never ending I/O, am I right?
This is perfectly normal.
The replication distribution agent is effectively running continuously to scan the transactions on your source to be able to send them to the replicas. Because it needs to capture these and forward them, it has to run continuously.
It is not frying your drive, unless your transaction rate is so high that the workload itself is doing so. The high read count is cumulative since the session logged in, not a snapshot of current activity. Assuming each read is an 8 KB page, it amounts to roughly 141 GB read over 20 days, which is not particularly heavy use.
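The 141 GB figure can be checked with a little arithmetic. This is a rough sketch in Python, assuming each counted read corresponds to one 8 KB page (an assumption, not something the DMV output states). It uses the reads, login_time and last_request_start_time values from the question.

```python
from datetime import datetime

PAGE_BYTES = 8 * 1024  # assumption: each read is one 8 KB SQL Server page

reads = 18_607_577                                # cumulative reads for session 69
login_time = datetime(2020, 9, 8, 18, 40, 57)     # login_time from the question
last_request = datetime(2020, 9, 28, 16, 28, 39)  # last_request_start_time

total_gib = reads * PAGE_BYTES / 1024**3
days = (last_request - login_time).total_seconds() / 86_400
print(f"{total_gib:.1f} GiB over {days:.1f} days = {total_gib / days:.1f} GiB/day")
```

Under that assumption the session averages around 7 GiB of reads per day, which is modest for an agent that runs continuously.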
AWS has a required SSL certificate update for its RDS instances going out on the 5th. Even though I do not actually use the certificate, I went ahead and ran the update so it was done and I wouldn't have any unexpected downtime. Only the SSL certificate should have been updated, as I understand it.
Instead my CPU usage went from less than 10% while idle to over 80%. Now I've isolated the cause of this to a query we run every few seconds to retrieve a list of recent transactions. And with some tweaking the CPU usage has returned to normal levels.
But this query has been in place for a few years without issues, and it's only after this SSL update that it has caused us any grief. My concern is that there is some deeper issue behind the scenes and that changing the query is merely treating a symptom. Before revising the query, I ran all pending updates and rebooted the database, with no change. There was also one other person on the AWS Forums with the same issue, but neither of us was able to get any useful responses. Thankfully the rest of the system seems to be behaving itself, but I want to know what's going on.
In case it can help identify why a query would suddenly use far more resources here is a (simplified) version of the query prior to my tweak.
SELECT Distinct Top (@NumOfTrx) [trx].* ,[c].*
FROM [dbo].[TRX_Transactions] trx
Inner Join #ProdSelection s on ((trx.Code = s.ID and s.ID != 0) or (s.ID = 0 and (trx.TypeID = 1 or StatusID = 4))
or (BatchId > 0 and s.ID in (select b.Code from [dbo].[TRX_Batch] b where b.BatchId = trx.BatchId)))
Join CLI_Details c on trx.UserName = c.UserName
where trx.TransactionDate > DATEADD(Day, -1, GETDATE()) and (trx.Amount >= @Size Or trx.TypeID = 1 or StatusID = 4)
And (@Company = 0 or c.Company = @Company) and (@Agent = '' or [Agent] = @Agent)
order by [trx].[TransactionDate] Desc
Removing the Prod selection join, a filter that is a list of ids that we can operate without for the time being, was what resolved the issue.
I am using Dapper on ADO.NET. So at present I am doing the following:
using (IDbConnection conn = new SqlConnection("MyConnectionString"))
{
conn.Open();
using (IDbTransaction transaction = conn.BeginTransaction())
{
// ...
However, there are various transaction isolation levels that can be set. I think these are the various settings.
My first question is how do I set the transaction level (where I am using Dapper)?
My second question is what is the correct level for each of the following cases? In each of these cases we have multiple instances of a web worker (Azure) service running that will be hitting the DB at the same time.
I need to run monthly charges on subscriptions. So in a transaction I need to read a record and if it's due for a charge create the invoice record and mark the record as processed. Any other read of that record for the same purpose needs to fail. But any other reads of that record that are just using it to verify that it is active need to succeed.
So what transaction do I use for the access that will be updating the processed column? And what transaction do I use for the other access that just needs to verify that the record is active?
In this case it's fine if a conflict causes the charge to not be run (we'll get it the next day). But it is critical that we not charge someone twice. And it is critical that the read to verify that the record is active succeed immediately while the other operation is in its transaction.
I need to update a record where I am setting just a couple of columns. One use case is I set a new password hash for a user record. It's fine if other access occurs during this except for deleting the record (I think that's the only problem use case). If another web service is also updating that's the user's problem for doing this in 2 places simultaneously.
But it's key that the record stays consistent. This includes the use case of "set NumUses = NumUses + @ParamNum", so the read, calculation, and write of the column value must be treated as one atomic action. And if I am setting 3 column values, they must all be written together.
1) Assuming that the invoicing process is a stored procedure with multiple statements, your best bet is to create a separate "lock" table that records whether the invoicing job is already running, e.g.
CREATE TABLE InvoicingJob( JobStarted DATETIME, IsRunning BIT NOT NULL )
-- Table will only ever have one record
INSERT INTO InvoicingJob
SELECT NULL, 0
EXEC InvoicingProcess
ALTER PROCEDURE InvoicingProcess
AS
BEGIN
DECLARE @InvoicingJob TABLE( IsRunning BIT )
-- Try to acquire lock
UPDATE InvoicingJob WITH( TABLOCK )
SET JobStarted = GETDATE(), IsRunning = 1
OUTPUT INSERTED.IsRunning INTO @InvoicingJob( IsRunning )
WHERE IsRunning = 0
-- job has been running for more than a day i.e. likely crashed without releasing a lock
-- OR ( IsRunning = 1 AND JobStarted <= DATEADD( DAY, -1, GETDATE() ) )
IF NOT EXISTS( SELECT * FROM @InvoicingJob )
BEGIN
PRINT 'Another Job is already running'
RETURN
END
ELSE
RAISERROR( 'Start Job', 0, 0 ) WITH NOWAIT
-- Do invoicing tasks
WAITFOR DELAY '00:01:00' -- to simulate execution time
-- Release lock
UPDATE InvoicingJob
SET IsRunning = 0
END
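The core of the procedure above is a conditional UPDATE that only succeeds while IsRunning is 0, so two workers cannot both acquire the lock. The same idea can be sketched outside T-SQL. The example below uses Python's built-in sqlite3 module purely as a stand-in for SQL Server (an assumption for illustration, as are the function names) and checks the affected row count instead of using the OUTPUT clause.

```python
import sqlite3


def try_acquire(conn: sqlite3.Connection) -> bool:
    """Atomically flip IsRunning from 0 to 1; True only for the winner."""
    cur = conn.execute(
        "UPDATE InvoicingJob "
        "SET JobStarted = CURRENT_TIMESTAMP, IsRunning = 1 "
        "WHERE IsRunning = 0"
    )
    conn.commit()
    # rowcount is 1 if this caller changed the row, 0 if it was already taken.
    return cur.rowcount == 1


def release(conn: sqlite3.Connection) -> None:
    """Mark the job as no longer running."""
    conn.execute("UPDATE InvoicingJob SET IsRunning = 0")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE InvoicingJob (JobStarted TEXT, IsRunning INTEGER NOT NULL)")
conn.execute("INSERT INTO InvoicingJob VALUES (NULL, 0)")  # the single lock row
conn.commit()

first = try_acquire(conn)   # first caller takes the lock
second = try_acquire(conn)  # second caller is refused while it is held
release(conn)
third = try_acquire(conn)   # lock is available again after release
print(first, second, third)
```

Because the test and the write happen in a single statement, there is no window between "check IsRunning" and "set IsRunning" in which a second worker could slip in.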
2) Read about how transactions work: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/transactions-transact-sql?view=sql-server-2017
https://learn.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql?view=sql-server-2017
Your second question is quite broad.
Can anyone help me with the steps to identify the exact culprit of a deadlock when the wait resource is a RID, and also how to resolve that deadlock?
You can use the sp_who and sp_who2 system stored procedures.
sp_who
sp_who returns information about system and user activity. It returns the following columns:
Spid: System process ID that requested the lock
Ecid: Execution context of the thread associated with the spid. Zero means the main thread; all other numbers mean sub-threads.
Status: Runnable, sleeping, or background. Runnable means the process is actually performing work; sleeping means the process is connected to the server but is idle at the moment.
Loginname: The login that has initiated the lock request
Hostname: The name of the computer where the lock request was initiated
Blk: The connection that is blocking the lock request from the current connection
Dbname: Database name where the lock has been requested
Cmd: General command type that requested the lock
The syntax of sp_who also allows specifying a single login; however, most of the time it is executed with no parameters. The output of sp_who is very similar to the output of sp_who2.
sp_who2
sp_who2 is a newer version of sp_who. It returns some additional information:
Spid: System process ID that requested the lock
Status: Background, sleeping, or runnable
Login: The login name that has requested the lock
HostName: The computer where the lock request has been initiated
BlkBy: The spid of the connection that is blocking the current connection
DbName: The database name where the lock request has been generated
Command: General command type that requested the lock
CPUTime: The number of milliseconds the request has used
DiskIO: Disk input/output that the command has used
LastBatch: Date and time of the last batch executed by the connection
ProgramName: The name of the application that issued the connection
Spid: In case you can't read the spid from the beginning of the output, it is repeated here
Example: sp_who2
Blocking details :
--============================================
--View Blocking in Current Database
--Author: Timothy Ford
--http://thesqlagentman.com
--============================================
SELECT DTL.resource_type,
CASE
WHEN DTL.resource_type IN ('DATABASE', 'FILE', 'METADATA') THEN DTL.resource_type
WHEN DTL.resource_type = 'OBJECT' THEN OBJECT_NAME(DTL.resource_associated_entity_id)
WHEN DTL.resource_type IN ('KEY', 'PAGE', 'RID') THEN
(
SELECT OBJECT_NAME([object_id])
FROM sys.partitions
WHERE sys.partitions.hobt_id =
DTL.resource_associated_entity_id
)
ELSE 'Unidentified'
END AS requested_object_name, DTL.request_mode, DTL.request_status,
DOWT.wait_duration_ms, DOWT.wait_type, DOWT.session_id AS [blocked_session_id],
sp_blocked.[loginame] AS [blocked_user], DEST_blocked.[text] AS [blocked_command],
DOWT.blocking_session_id, sp_blocking.[loginame] AS [blocking_user],
DEST_blocking.[text] AS [blocking_command], DOWT.resource_description
FROM sys.dm_tran_locks DTL
INNER JOIN sys.dm_os_waiting_tasks DOWT
ON DTL.lock_owner_address = DOWT.resource_address
INNER JOIN sys.sysprocesses sp_blocked
ON DOWT.[session_id] = sp_blocked.[spid]
INNER JOIN sys.sysprocesses sp_blocking
ON DOWT.[blocking_session_id] = sp_blocking.[spid]
CROSS APPLY sys.[dm_exec_sql_text](sp_blocked.[sql_handle]) AS DEST_blocked
CROSS APPLY sys.[dm_exec_sql_text](sp_blocking.[sql_handle]) AS DEST_blocking
WHERE DTL.[resource_database_id] = DB_ID()
I have this open transaction, according to DBCC OPENTRAN:
Oldest active transaction:
SPID (server process ID) : 54
UID (user ID) : -1
Name : UPDATE
LSN : (4196:12146:1)
Start time : Jul 20 2011 12:44:23:590PM
SID : 0x01
Is there a way to kill it/ roll it back?
You should first figure out what it was doing, where it came from, and if applicable how much longer it might be expected to run:
SELECT
r.[session_id],
c.[client_net_address],
s.[host_name],
c.[connect_time],
[request_start_time] = s.[last_request_start_time],
[current_time] = CURRENT_TIMESTAMP,
r.[percent_complete],
[estimated_finish_time] = DATEADD
(
MILLISECOND,
r.[estimated_completion_time],
CURRENT_TIMESTAMP
),
current_command = SUBSTRING
(
t.[text],
r.[statement_start_offset]/2,
COALESCE(NULLIF(r.[statement_end_offset], -1)/2, 2147483647)
),
module = COALESCE(QUOTENAME(OBJECT_SCHEMA_NAME(t.[objectid], t.[dbid]))
+ '.' + QUOTENAME(OBJECT_NAME(t.[objectid], t.[dbid])), '<ad hoc>'),
[status] = UPPER(s.[status])
FROM
sys.dm_exec_connections AS c
INNER JOIN
sys.dm_exec_sessions AS s
ON c.session_id = s.session_id
LEFT OUTER JOIN
sys.dm_exec_requests AS r
ON r.[session_id] = s.[session_id]
OUTER APPLY
sys.dm_exec_sql_text(r.[sql_handle]) AS t
WHERE
c.session_id = 54;
If you are confident that you can sever this connection you can use:
KILL 54;
Just be aware that depending on what the session was doing it could leave data and/or the app that called it in a weird state.
In cases of deadlock, run the following command at regular intervals:
DBCC opentran()
If the same SPID is returned repeatedly in a report like the following:
Oldest active transaction:
SPID (server process ID): 131
UID (user ID) : -1
Name : implicit_transaction
LSN : (634998:226913:1)
Start time : Jan 19 2022 6:36:36:360PM
SID : 0x010500000000000515000000c6bb507a9dbeda5275b975547b3e0000
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
Completion time: 2022-01-19T18:36:38.8421769+03:00
Then query the details of that session. It is critical to permanently resolve the source of this problem.
exec sp_who2 131
exec sp_lock 131
After investigating the cause, you can resolve the deadlock by killing that process.
KILL 131
If you want to see all SPIDs with open transactions and their blockers as a table, use the following query.
SELECT spid, blocked,[dbid],last_batch,open_tran
FROM master.sys.sysprocesses
WHERE open_tran <> 0
I ran into a situation where sessions were locked up, as reported by DBCC OPENTRAN, but because of the corporate lockdown of the server/database I did not have permission to use KILL.
I discovered that the app I was using to execute the script(s), VS 2022, was complicit, so to speak, in keeping the transactions alive. When I closed the app, it notified me that there were active sessions running and that closing could have consequences. After I accepted the notifications and closed the app, the open transactions were closed.