Blocked on trip to MARS

The database tool I'm writing investigates blocked queries: if the main query is delayed, it runs a parallel query against sys.dm_exec_requests to find the cause of the delay.
That works fine if the investigating connection has the VIEW SERVER STATE permission. If not, however, sys.dm_exec_requests only contains entries for the connection it runs on - which is somewhat pointless for connections where only one query can run at a time.
Enter MARS - the first time I've thought this arcane feature might be useful for something.
With MARS enabled, I can run the investigating query on the same connection as the delayed query we're investigating.
However, a simple test shows that if the first MARS query is blocked, the second one apparently is as well, even if it has no reason to be.
I'm running this test code in LINQPad (with Dapper for a tighter code sample, but I got the same effect in my app, which doesn't use Dapper):
var csb = new SqlConnectionStringBuilder();
csb.TrustServerCertificate = true;
csb.DataSource = @".\";
csb.InitialCatalog = "...";
csb.IntegratedSecurity = true;
using var c0 = new SqlConnection(csb.ConnectionString);
csb.MultipleActiveResultSets = true;
using var c1 = new SqlConnection(csb.ConnectionString);
using var c2 = new SqlConnection(csb.ConnectionString);

// Begin the blocking transaction on connection #0
await c0.QueryAsync(@"
begin transaction
select * from mytable with (tablockx, holdlock)
");

// This query on connection #1 is blocked by connection #0
var blockedTask = c1.QuerySingleAsync<int>("select count(*) from mytable");

// Strangely, this second query is blocked as well
var requests = await c1.QueryAsync(@"
select session_id, cpu_time, reads, logical_reads
from sys.dm_exec_requests r
");

// We don't get here unless you swap `c1` for `c2` in the last query, making
// it run on its own connection, thus requiring VIEW SERVER STATE to be useful
requests.Dump();
await blockedTask;
To reproduce this, you just need a database with any random table.

MARS allows interleaved execution of multiple requests on the same connection, not concurrent execution.
In the case of a blocked SELECT query, other queries on the same connection cannot execute until the blocked SELECT completes or yields by returning results.
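To make "interleaved, not concurrent" concrete, here is a minimal sketch (plain ADO.NET, reusing the MARS-enabled csb and the assumed mytable from the test above) of the only situation in which a second request gets to run: the first request has reached a yield point, such as returning rows. A request stuck waiting on a lock never reaches a yield point, which is why the investigating query queues behind the blocked one.

using var conn = new SqlConnection(csb.ConnectionString); // MARS-enabled string from above
conn.Open();

using var cmd1 = new SqlCommand("select * from mytable", conn);
using var reader1 = cmd1.ExecuteReader();
reader1.Read(); // request #1 has yielded mid-result-set...

using var cmd2 = new SqlCommand("select count(*) from mytable", conn);
var count = (int)cmd2.ExecuteScalar(); // ...so request #2 can interleave here

// Had request #1 been blocked on a lock before its first yield point,
// cmd2 would simply queue behind it - the effect observed above.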

Related

SQL Server :: Replication Distribution Agent never ending

I'm running a SQL Server 2019 Always On Availability Group with asynchronous replication.
I use a free tool called IDERA SQL Check, and I have spotted SPID 69, whose program name is Replication Distribution Agent. It's always there, staring at me like a bored cat.
This SPID 69 points to a specific database which is mirrored. I investigated it with this query:
select
    s.session_id
    ,login_name
    ,login_time
    ,host_name
    ,program_name
    ,status
    ,cpu_time
    ,memory_usage
    ,total_scheduled_time
    ,total_elapsed_time
    ,last_request_start_time
    ,reads
    ,writes
    ,logical_reads
    ,st.[text]
from sys.dm_exec_sessions s
inner join sys.dm_exec_connections c
    on s.session_id = c.session_id
outer apply sys.dm_exec_sql_text(c.most_recent_sql_handle) st
where s.is_user_process = 1
  and s.open_transaction_count > 0;
Which gave me this response:
session_id = 69
text = begin tran
login_time = 2020-09-08 18:40:57.153
program_name = Replication Distribution Agent
status = sleeping
cpu_time = 1362772
memory_usage = 4
total_scheduled_time = 1689634
total_elapsed_time = 22354857
last_request_start_time = 2020-09-28 16:28:39.433
reads = 18607577
writes = 5166597
logical_reads = 112256365
Now, on the internet I read that when you see the Replication Distribution Agent it's all good: that agent should be running and there should be no problem. But why:
does the text say begin tran and nothing more?
is IDERA SQL Check labelling it as "connection idling transaction"?
is the status sleeping?
I'm also concerned that the CPU time, reads and writes are basically telling me that this process is frying the drive with never-ending I/O. Am I right?
This is perfectly normal.
The Replication Distribution Agent effectively runs continuously, scanning the transactions on your source so it can send them to the replicas. Because it needs to capture and forward these transactions, it has to run continuously.
It is not frying your drive - unless your transaction rate is so high that it actually is frying your drive. The reads figure only looks alarming because it is cumulative since login, not a snapshot of current activity. It suggests the agent has read the equivalent of about 141 GB over 20 days - not particularly heavy use.
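As a quick sanity check on that claim, a minimal sketch of the arithmetic (assuming reads counts 8 KB pages, which is what sys.dm_exec_sessions reports):

long reads = 18607577;                          // 'reads' from the output above, in 8 KB pages
double gigabytes = reads * 8.0 / (1024 * 1024); // ~142 GB read since login - the figure above
double perDay = gigabytes / 20;                 // ~7 GB/day over the roughly 20-day window
Console.WriteLine($"{gigabytes:F0} GB total, {perDay:F1} GB/day");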

Introducing delays on SQL Server DB for testing timeouts in API

I have an IIS application hosted on a Windows VM which runs queries against a local SQL Server. For the purposes of testing timeouts and a few other configuration changes, I want to be able to simulate a delay whenever a query is executed on the DB by IIS. I have researched this for a bit and found the following:
https://dba.stackexchange.com/questions/144591/need-to-intentionally-create-blocking-processes-for-testing
However, this is not working, as I cannot see the expected delay in the API response. Note that I am executing the query mentioned in the link above through the local SSMS in the VM.
Queries executed:
Window 1:
BEGIN TRANSACTION
SELECT * FROM TestAudit.dbo.TestAT WITH (TABLOCKX, HOLDLOCK)
WAITFOR DELAY '00:00:30'
ROLLBACK TRANSACTION
Window 2:
SELECT TestTId,
       TestAId,
       TestST,
       TestO,
       TestRC,
       TestSD
FROM TestAudit.dbo.TestAT
WHERE (ServerTimestamp >= '2018-10-16 01:48:21.344'
       AND ServerTimestamp <= '2018-10-16 01:48:22.344')
  AND TestAId = '2000093309'
  AND TestO IN ('A', 'B')
  AND (ClientName <> 'Test.Admin'
       AND ClientName NOT LIKE '%TestIgnore%');
I expected a delay when executing the second query, given that the first had been executed in a different window, but I don't experience any delay.
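One environmental factor worth ruling out (an assumption on my part, not something stated in the question): if READ_COMMITTED_SNAPSHOT is enabled on the database, readers use row versioning instead of shared locks, so the SELECT in Window 2 will never wait behind the TABLOCKX held in Window 1. A minimal sketch to check, with an assumed connectionString:

using var conn = new SqlConnection(connectionString); // assumed
conn.Open();
using var cmd = new SqlCommand(
    "select is_read_committed_snapshot_on from sys.databases where name = 'TestAudit'", conn);
bool rcsiOn = (bool)cmd.ExecuteScalar();
Console.WriteLine(rcsiOn
    ? "RCSI is ON - plain SELECTs won't block behind TABLOCKX"
    : "RCSI is OFF - the SELECT should block; check both windows hit the same database");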

Is there something faster than the Enumerable.Except<TSource> method?

I have a program that downloads data from a server database to a client database. The server database keeps growing recently.
In that program there is an option to download all data OR download data for a specific time period (the user can select backward days from today). If the user selects all, I wrote the program to truncate the client database table and insert all data using bulk copy. That part is OK.
But the problem is when the user selects a specific time period (each record has a created date-time): the program has to compare the two tables and divide the records (server data) into two sets - existing data and non-existing data. And what I'm going to do is:
insert the non-existing data directly into the client DB (I'm using bulk insert), and insert the existing data into a temporary table using bulk copy, then update the client's table using that temporary table. My actual problem occurs when dividing the server's table. This is how I did it:
updateTable = (From c In dt_from_server.AsEnumerable()
               Join o In Dt_from_client.AsEnumerable()
               On c.Field(Of String)("BARCODE").Trim() Equals o.Field(Of String)("BARCODE").Trim()
                  And c.Field(Of String)("ITEM_CODE").Trim() Equals o.Field(Of String)("ITEM_CODE").Trim()
               Select c).CopyToDataTable()

insertTable = dt_server.AsEnumerable()
                       .Except(updateTable.AsEnumerable(), DataRowComparer.Default)
                       .CopyToDataTable()
(normally there are over 1 million records in the server table)
When there are over 1 million records, the update part takes an acceptable time, around 10 minutes (yes, it takes 5 GB of RAM - in this case that's OK considering the performance),
but the insert part seems to take days, just to assign insertTable (a DataTable). This is the issue.
The AsEnumerable().Except() part takes a long time and I couldn't find a way to speed up this process. I'm not sure I explained this correctly. Could anyone give me some advice?
Since you have commented that dt_from_server and dt_server are actually the same DataTable, you don't need to compare all values of all DataRows with each other, which is what DataRowComparer.Default does. You can use Except without the second comparer parameter; then only references are compared, which is much faster (see the sketch just below).
You also don't need two CopyToDataTable calls, which create two additional big DataTables in memory - process the rows one after the other.
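A minimal sketch of that reference-equality variant (in C#, matching the translation further down; it only works because both sides hold the very same DataRow instances):

// No comparer: Except uses reference equality for DataRow - a cheap hash-set
// lookup instead of the column-by-column compare DataRowComparer.Default does.
var insertRows = dt_from_server.AsEnumerable()
                               .Except(updateTable.AsEnumerable());
foreach (var row in insertRows)
{
    // bulk-copy / process each row here instead of building another DataTable
}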
Here is a different approach using LINQ's left outer join, which is more efficient:
Dim query = From rServ In dt_from_server.AsEnumerable()
            Group Join rClient In Dt_from_client.AsEnumerable()
            On New With {
                Key .BarCode = rServ.Field(Of String)("BARCODE").Trim(),
                Key .ItemCode = rServ.Field(Of String)("ITEM_CODE").Trim()
            } Equals New With {
                Key .BarCode = rClient.Field(Of String)("BARCODE").Trim(),
                Key .ItemCode = rClient.Field(Of String)("ITEM_CODE").Trim()
            } Into Group
            From client In Group.DefaultIfEmpty()
            Select New With {.ServerRow = rServ, .InsertRow = client Is Nothing}

Dim insertOrUpdateRows = query.ToLookup(Function(x) x.InsertRow, Function(x) x.ServerRow)
Dim insertRows = insertOrUpdateRows(True).CopyToDataTable() ' CopyToDataTable redundant if you process rows immediately
Dim updateRows = insertOrUpdateRows(False).CopyToDataTable() ' CopyToDataTable redundant if you process rows immediately
But in general the most scalable and efficient approach would be not to load everything into memory at once and then process it all, but to use database paging (or a stored procedure) to process only parts of it in memory; otherwise it's likely that you will encounter an OutOfMemoryException sooner or later. (A paging sketch follows the C# version below.)
C# as requested:
var query = from rServ in dt_from_server.AsEnumerable()
            join rClient in Dt_from_client.AsEnumerable()
                on new { BarCode = rServ.Field<string>("BARCODE").Trim(), ItemCode = rServ.Field<string>("ITEM_CODE").Trim() }
                equals new { BarCode = rClient.Field<string>("BARCODE").Trim(), ItemCode = rClient.Field<string>("ITEM_CODE").Trim() }
                into clientGroup
            from client in clientGroup.DefaultIfEmpty()
            select new { ServerRow = rServ, InsertRow = client == null };

var insertOrUpdateRows = query.ToLookup(x => x.InsertRow, x => x.ServerRow);
var insertRows = insertOrUpdateRows[true].CopyToDataTable();   // CopyToDataTable redundant if you process rows immediately
var updateRows = insertOrUpdateRows[false].CopyToDataTable();  // CopyToDataTable redundant if you process rows immediately
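And here is a rough sketch of the paging idea mentioned above (table, column and connection-string names are assumptions; OFFSET/FETCH requires SQL Server 2012 or later). Only one page of server rows is ever held in memory at a time:

const int pageSize = 50000;
for (var offset = 0; ; offset += pageSize)
{
    var page = new DataTable();
    using (var conn = new SqlConnection(serverConnectionString)) // assumed
    using (var cmd = new SqlCommand(
        @"select * from ServerTable
          order by BARCODE, ITEM_CODE
          offset @offset rows fetch next @pageSize rows only", conn))
    {
        cmd.Parameters.AddWithValue("@offset", offset);
        cmd.Parameters.AddWithValue("@pageSize", pageSize);
        conn.Open();
        page.Load(cmd.ExecuteReader());
    }
    if (page.Rows.Count == 0) break;
    // classify this page into insert/update rows (as above) and bulk-copy it,
    // then let it go out of scope before fetching the next page
}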

SQL CLR Trigger - get source table

I am creating a DB synchronization engine using SQL CLR Triggers in Microsoft SQL Server 2012. These triggers do not call a stored procedure or function (and thereby have access to the INSERTED and DELETED pseudo-tables but do not have access to @@PROCID).
Differences here, for reference.
This "sync engine" uses mapping tables to determine what the table and field maps are for this sync job. In order to determine the target table and fields (from my mapping table) I need to get the source table name from the trigger itself. I have come across many answers on Stack Overflow and other sites that say that this isn't possible. But, I've found one website that provides a clue:
Potential Solution:
using (SqlConnection lConnection = new SqlConnection(@"context connection=true"))
{
    SqlCommand cmd = new SqlCommand("SELECT object_name(resource_associated_entity_id) FROM sys.dm_tran_locks WHERE request_session_id = @@spid and resource_type = 'OBJECT'", lConnection);
    cmd.CommandType = CommandType.Text;
    lConnection.Open(); // the connection must be opened before executing
    var obj = cmd.ExecuteScalar();
}
This does in fact return the correct table name.
Question:
My question is, how reliable is this potential solution? Is the @@SPID actually limited to this single trigger execution? Or is it possible that other simultaneous triggers will overlap within this process id? Will it stand up to multiple executions of the same and/or different triggers within the database?
From these sites, it seems the process Id is in fact limited to the open connection, which doesn't overlap: here, here, and here.
Will this be a safe method to get my source table?
Why?
I've noticed similar questions, but all without a valid answer for my specific situation (except that one). Most of the comments on those sites ask "Why?", so in order to preempt that, here is why:
This synchronization engine operates on a single DB and can push changes to target tables, transforming the data with user-defined transformations, automatic source-to-target type casting and parsing and can even use the CSharpCodeProvider to execute methods also stored in those mapping tables for transforming data. It is already built, quite robust and has good performance metrics for what we are doing. I'm now trying to build it out to allow for 1:n table changes (including extension tables requiring the same Id as the 'master' table) and am trying to "genericise" the code. Previously each trigger had a "target table" definition hard coded in it and I was using my mapping tables to determine the source. Now I'd like to get the source table and use my mapping tables to determine all the target tables. This is used in a medium-load environment and pushes changes to a "Change Order Book" which a separate server process picks up to finish the CRUD operation.
Edit
As mentioned in the comments, the query listed above is quite "iffy". It will often (after a SQL Server restart, for example) return system objects like syscolpars or sysidxstats. But it seems that in sys.dm_tran_locks there is always an associated resource_type of 'RID' (Row ID) with the same object_name. My current query, which has worked reliably so far, is the following (I will update this if it changes or doesn't work under high-load testing):
select t1.ObjectName FROM (
    SELECT object_name(resource_associated_entity_id) as ObjectName
    FROM sys.dm_tran_locks WHERE resource_type = 'OBJECT' and request_session_id = @@spid
) t1 inner join (
    SELECT OBJECT_NAME(partitions.OBJECT_ID) as ObjectName
    FROM sys.dm_tran_locks
    INNER JOIN sys.partitions ON partitions.hobt_id = dm_tran_locks.resource_associated_entity_id
    WHERE resource_type = 'RID'
) t2 on t1.ObjectName = t2.ObjectName
If this is always the case, I'll have to find that out during testing.
How reliable is this potential solution?
While I do not have time to set up a test case to show it not working, I find this approach (even taking into account the query in the Edit section) "iffy" (i.e. not guaranteed to always be reliable).
The main concerns are:
cascading (whether recursive or not) Trigger executions
User (i.e. Explicit / Implicit) transactions
Sub-processes (i.e. EXEC and sp_executesql)
These scenarios allow for multiple objects to be locked, all at the same time.
Is the @@SPID actually limited to this single trigger execution? Or is it possible that other simultaneous triggers will overlap within this process id?
and (from a comment on the question):
I think I can join my query up with the sys.partitions and get a dm_trans_lock that has a type of 'RID' with an object name that will match up to the one in my original query.
And here is why it shouldn't be entirely reliable: the Session ID (i.e. @@SPID) is constant for all of the requests on that Connection. So all sub-processes (i.e. EXEC calls, sp_executesql, Triggers, etc) will all be on the same @@SPID / session_id. So, between sub-processes and User Transactions, you can very easily get locks on multiple resources, all on the same Session ID.
The reason I say "resources" instead of "OBJECT" or even "RID" is that locks can occur on: rows, pages, keys, tables, schemas, stored procedures, the database itself, etc. More than one thing can be considered an "OBJECT", and it is possible that you will have page locks instead of row locks.
Will it stand up to multiple executions of the same and/or different triggers within the database?
As long as these executions occur in different Sessions, then they are a non-issue.
ALL THAT BEING SAID, I can see where simple testing would show that your current method is reliable. However, it should also be easy enough to add more detailed tests that include an explicit transaction that first does some DML on another table, or have a trigger on one table do some DML on one of these tables, etc.
Unfortunately, there is no built-in mechanism that provides the same functionality that @@PROCID does for T-SQL Triggers. I have come up with a scheme that should allow for getting the parent table for a SQLCLR Trigger (that takes into account these various issues), but haven't had a chance to test it out. It requires using a T-SQL trigger, set as the "first" trigger, to set info that can be discovered by the SQLCLR Trigger.
A simpler form can be constructed using CONTEXT_INFO, if you are not already using it for something else (and if you don't already have a "first" Trigger set). In this approach you would still create a T-SQL Trigger, and then set it as the "first" Trigger using sp_settriggerorder. In this Trigger you SET CONTEXT_INFO to the table name that is the parent of @@PROCID. You can then read CONTEXT_INFO() on a Context Connection in a SQLCLR Trigger. If there are multiple levels of Triggers then the value of CONTEXT_INFO will get overwritten, so reading that value must be the first thing you do in each SQLCLR Trigger.
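A rough, untested sketch of that CONTEXT_INFO handoff (all object names here are assumptions):

// T-SQL side (assumed names), installed as the "first" trigger via sp_settriggerorder:
//
//   CREATE TRIGGER dbo.trgSetSourceTable ON dbo.MyTable AFTER INSERT, UPDATE, DELETE
//   AS BEGIN
//       DECLARE @ctx VARBINARY(128) = CONVERT(VARBINARY(128),
//           (SELECT OBJECT_NAME(parent_object_id) FROM sys.objects WHERE object_id = @@PROCID));
//       SET CONTEXT_INFO @ctx;
//   END;
//
// SQLCLR side: read the value back first thing, before any nested trigger overwrites it.
using (var conn = new SqlConnection("context connection=true"))
using (var cmd = new SqlCommand("SELECT CONVERT(NVARCHAR(64), CONTEXT_INFO())", conn))
{
    conn.Open();
    // CONTEXT_INFO is zero-padded to 128 bytes, so trim the trailing NULs
    string sourceTable = (cmd.ExecuteScalar() as string)?.TrimEnd('\0');
}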
This is an old thread, but it is an FAQ and I think I have a better solution. Essentially it uses the schema of the inserted or deleted table to find the base table, by hashing the column names and comparing that hash with the hashes of tables that have a CLR trigger on them.
Code snippet below - at some point I will probably put the whole solution on Git (it sends a message to Azure Service Bus when the trigger fires).
private const string colqry = "select top 1 * from inserted union all select top 1 * from deleted";
private const string hashqry = "WITH cols as ( "+
"select top 100000 c.object_id, column_id, c.[name] "+
"from sys.columns c "+
"JOIN sys.objects ot on (c.object_id= ot.parent_object_id and ot.type= 'TA') " +
"order by c.object_id, column_id ) "+
"SELECT s.[name] + '.' + o.[name] as 'TableName', CONVERT(NCHAR(32), HASHBYTES('MD5',STRING_AGG(CONVERT(NCHAR(32), HASHBYTES('MD5', cols.[name]), 2), '|')),2) as 'MD5Hash' " +
"FROM cols "+
"JOIN sys.objects o on (cols.object_id= o.object_id) "+
"JOIN sys.schemas s on (o.schema_id= s.schema_id) "+
"WHERE o.is_ms_shipped = 0 "+
"GROUP BY s.[name], o.[name]";
public static void trgSendSBMsg()
{
    string table = "";
    SqlCommand cmd;
    SqlDataReader rdr;
    SqlTriggerContext trigContxt = SqlContext.TriggerContext;
    SqlPipe p = SqlContext.Pipe;
    using (SqlConnection con = new SqlConnection("context connection=true"))
    {
        try
        {
            con.Open();
            // Step 1: hash the column names of this trigger's inserted/deleted schema
            string tblhash = "";
            using (cmd = new SqlCommand(colqry, con))
            {
                using (rdr = cmd.ExecuteReader(CommandBehavior.SingleResult))
                {
                    if (rdr.Read())
                    {
                        MD5 hash = MD5.Create();
                        StringBuilder hashstr = new StringBuilder(250);
                        for (int i = 0; i < rdr.FieldCount; i++)
                        {
                            if (i > 0) hashstr.Append("|");
                            hashstr.Append(GetMD5Hash(hash, rdr.GetName(i)));
                        }
                        tblhash = GetMD5Hash(hash, hashstr.ToString().ToUpper()).ToUpper();
                    }
                    rdr.Close();
                }
            }
            // Step 2: compare against the column-name hash of every table with a CLR trigger
            using (cmd = new SqlCommand(hashqry, con))
            {
                using (rdr = cmd.ExecuteReader(CommandBehavior.SingleResult))
                {
                    while (rdr.Read())
                    {
                        string hash = rdr.GetString(1).ToUpper();
                        if (hash == tblhash)
                        {
                            table = rdr.GetString(0);
                            break;
                        }
                    }
                    rdr.Close();
                }
            }
            if (table.Length == 0)
            {
                p.Send("Error: Unable to find table that CLR trigger is on. Message not sent!");
                return;
            }
            ….
HTH

SQL Server deadlock with simple update statement

Using SQL Server 2008 R2, I am getting deadlocks when the same update statement (with different parameters) is running concurrently. Here is the deadlock graph (sorry, I cannot post images on here yet):
http://i.stack.imgur.com/E6JBK.png
And here is the actual execution plan:
http://i.stack.imgur.com/emm9i.png
The update is like this:
exec sp_executesql N'UPDATE mapping.IssuerAlternateName
SET
    UseCount = UseCount + 1,
    MostRecentlyAppeared = GETDATE(),
    MostRecentlyAppearedUnderlyingAssetName = @p1
WHERE ID = @p0
',N'@p0 int,@p1 nvarchar(4000)',@p0=1234,@p1=N'blah blah blah'
If I have understood things correctly we are trying to read and write from the same index (PK_IssuerAlternateName_1).
Is there any way to resolve this? I was wondering if adding an additional index to the primary key and using WITH INDEX might fix it by stopping the read of PK_IssuerAlternateName_1 (sorry, the full name is truncated in the execution plan screenshot).
Or is the best option just to live with this and retry the transaction, which is how the error is currently handled in the .NET client? It is certainly successful on retry, but it would be good to avoid the deadlock if possible.
Thanks
In situations similar to this, I have used the UPDLOCK hint to let the database know I intend to update this row. It is not implied by the UPDATE statement. Without the lock hint, it will first obtain a "shared" lock, and then try to escalate. However, this causes deadlocks in certain scenarios.
You will need to do this within your own TransactionScope to ensure everything works correctly.
var sql = #"UPDATE mapping.IssuerAlternateName with (UPDLOCK)
SET
UseCount = UseCount + 1,
MostRecentlyAppeared = GETDATE(),
MostRecentlyAppearedUnderlyingAssetName = #p1
WHERE ID = #p0";
var options = new TransactionOptions()
{
IsolationLevel = IsolationLevel.ReadCommitted // don't use Serializable!
};
using (var scope = new TransactionScope(TransactionScopeOption.RequiresNew, options))
{
    using (var context = new YourDbContext())
    {
        // execute your command here
    }
    scope.Complete(); // without this, the transaction rolls back on dispose
}
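If you do end up keeping the retry route instead, a minimal sketch of handling the deadlock-victim error (1205 is SQL Server's deadlock error number; ExecuteUpdate and the attempt limit are assumptions):

const int maxAttempts = 3;
for (var attempt = 1; ; attempt++)
{
    try
    {
        ExecuteUpdate(); // hypothetical helper running the parameterized UPDATE above
        break;
    }
    catch (SqlException ex) when (ex.Number == 1205 && attempt < maxAttempts)
    {
        Thread.Sleep(100 * attempt); // brief backoff before retrying as deadlock victim
    }
}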
