What does SQL Server sys.dm_broker_activated_tasks tell me? - sql-server

I have a Service Broker application I inherited that abends with a false negative.
I think it is using sys.dm_broker_activated_tasks incorrectly, and I want to validate that my understanding of what that view shows is correct.
Can I assume that this view shows tasks that are being activated, and not so much those that were activated but are now in the process of completing?
The procedure I have monitors for completion of processing by looking for when there are no entries in sys.dm_broker_activated_tasks for that queue.
This appears to work (mostly), except occasionally at the end when processing in the queue is winding down.
The row in that view seems to disappear before the final message in the queue has completed.
And unfortunately, since this uses the fire-and-forget anti-pattern, I can't really do more at this time than make the polling monitor a bit smarter.

That view doesn't show much beyond what the documentation states:
Returns a row for each stored procedure activated by Service Broker.
https://msdn.microsoft.com/en-us/library/ms175029.aspx
Not sure if you have looked at the code, but I think a better use of it is to combine it with sys.dm_exec_sessions:
SELECT
    at.spid,
    DB_NAME(at.database_id) AS [DatabaseName],
    at.queue_id,
    at.[procedure_name],
    s.[status],
    s.login_time
FROM sys.dm_broker_activated_tasks AS at
INNER JOIN sys.dm_exec_sessions AS s
    ON at.spid = s.session_id;
Another good place to troubleshoot Service Broker is sys.transmission_queue. You will see every message sent there until there is an acknowledgement that it was received.
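For example, a quick way to see what is stuck in transit and why — a sketch that uses only the documented columns of the view:

```sql
-- Messages still awaiting delivery/acknowledgement, with the reason.
SELECT
    conversation_handle,
    to_service_name,
    message_type_name,
    enqueue_time,
    is_conversation_error,
    transmission_status   -- empty string means no delivery error recorded yet
FROM sys.transmission_queue;
```

A non-empty transmission_status (e.g. a routing or security error) usually points straight at the delivery problem.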

Related

Transaction causing stored procedure to hang

We have a Windows service that orchestrates imports to a database. The service itself isn't the problem, as it essentially just creates a scheduled job, waits until it completes, and then fires a series of stored procs. The problem is that one of the procs appears to be getting stuck midway through. It's not throwing an error, so I have nothing that I can give as a definitive problem. I have narrowed it down to a single proc that gets called after the job has completed. I've even managed to narrow it down to a specific line of code, but that's where I'm struggling.
The proc will define a transaction name at the start, being the name of the proc and a datetime. It also gets a transaction count from @@TRANCOUNT. It then defines a cursor that loops over the files associated with the event. Inside a try block it dynamically creates a view (which definitely happens, as I write a log entry afterwards). Immediately after this, there is an IF condition that either begins or saves the transaction based on whether the variable holding @@TRANCOUNT is zero or not. Inside this condition I write a message to our log table BEFORE the transaction is begun/saved.
Immediately after (regardless of whether it's a begin or a save) I write another log message. The log entry is there. The times we've seen this pausing, the proc always writes the begin-transaction log message. It doesn't get as far as writing the message outside the condition. The only thing that happens between the first message (pre begin/save tran) and the second message (post tran) is the BEGIN/SAVE TRANSACTION itself. As the message being logged is the begin message, there can't be a transaction open (@@TRANCOUNT must have been zero). However, as no error is raised, I can't say with 100% certainty that this is the case. The line where it seems to stop is the BEGIN TRANSACTION @TransactionName line. This seems to imply that something is locking and preventing the statement from being executed. The problem is we can see no open transactions (DBCC reports nothing open); the proc just hangs there.
We’re fairly certain that it’s a lock of some description, but completely baffled as to what. To add a level of complexity, it doesn’t occur every time. Sometimes, with the same file, we can run the process without any issue on this database. We’ve tried running the file against another database without managing to replicate the problem, but we have seen it occur on other databases on this server (the server holds multiple client databases that do the same thing). It also only happens on this server. We have other servers in the environment, with seemingly identical configs, where we haven't seen this issue surface.
Unfortunately we can’t post any of the code due to internal rules, but any ideas would be appreciated.
Try using sp_whoisactive with the lock flag (@get_locks = 1) enabled. I also recommend finding the query plan with the code below and analyzing the stats there.
SELECT * FROM
(
    SELECT
        DB_NAME(p.dbid) AS DBName,
        OBJECT_NAME(p.objectid, p.dbid) AS ObjectName,
        cp.usecounts,
        p.query_plan,
        q.[text],
        p.dbid,
        p.objectid,
        p.number,
        p.encrypted,
        cp.plan_handle
    FROM sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS p
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS q
    WHERE cp.cacheobjtype = 'Compiled Plan'
) AS a
WHERE [text] LIKE '%SNIPPET OF SQL GOES HERE THAT IS PART OF THE QUERY YOU WANT TO FIND%'
ORDER BY dbid, objectid;
Have you thought about checking the connection timeout properties for the service? If the timeout is set too low and the proc takes longer to run, the connection will be dropped and the process killed off.
This is much more likely than it being anything to do with the name of the transaction.

Capture TSQL when Missing_Join_Predicate event occurs

I am configuring Event Notifications on a Service Broker Queue to log when various performance related events occur. One of these is Missing_Join_Predicate.
The XML payload of this event holds nothing useful for identifying the cause (TSQL, query plan, object id(s), etc.), so in the procedure that processes the queue I am trying to use the TransactionID to query dm_exec_requests and dm_exec_query_plan, getting the query plan and TSQL where dm_exec_requests.transaction_id matches the TransactionID from the event.
The query returns no data.
Removing the filter from the query (i.e. collecting all rows from dm_exec_requests and dm_exec_query_plan) shows there are records returned, but none for the TransactionID in question.
Is what I am trying to do possible? Where am I going wrong?!
The trace-based event notifications, like MISSING_JOIN_PREDICATE, are just a verbatim translation of the corresponding trace event (the Missing Join Predicate event class) and carry exactly the same info. For this particular event there's basically no useful info whatsoever; the <TransactionID> is the xact id that triggered the event, and by the time you dequeue and process the notification message, the transaction is most likely finished and gone (yay for asynchronous queue-based processing).
When using the original trace event, e.g. with Profiler, you could also enable SQL:BatchCompleted, filter appropriately, and then catch the JOIN culprit in the act. But with Event Notifications I don't see any feasible way to automate the process to the point where you can pinpoint the problem query and application. With EN you can, at best, raise awareness of the problem, show the client that causes it (the app), and then use other means (e.g. code inspection) to actually hunt down the root cause.
Unfortunately you'll discover that there are more event notification events that suffer from similar problems.
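For reference, setting up such a notification looks roughly like this. This is a sketch only: the notification name, service name, and scope shown here are placeholders, not taken from the question.

```sql
-- Server-scoped event notification for the missing-join-predicate event,
-- delivered to a Service Broker service (names are hypothetical).
CREATE EVENT NOTIFICATION MissingJoinPredicateNotification
ON SERVER
FOR MISSING_JOIN_PREDICATE
TO SERVICE 'PerfEventsService', 'current database';
```

The receiving service's queue then gets the XML payload the question describes, with the <TransactionID> element and little else.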

Sql Server Service Broker - thorough, in-use example of externally activated console app

I need some guidance from anyone who has deployed a real-world, in-production application that uses the Sql Server Service Broker external activation mechanism (via the Service Broker External Activator from the Feature Pack).
Current mindset:
My specs are rather simple (or at least I think so), so I'm thinking of the following basic flow:
order-like entity gets inserted into a Table_Orders with state "confirmed"
SP_BeginOrder gets executed and does the following:
begins a TRANSACTION
starts a DIALOG from Service_HandleOrderState to Service_PreprocessOrder
stores the conversation handle (from now on PreprocessingHandle) in a specific column of the Orders table
sends a MESSAGE of type Message_PreprocessOrder containing the order id using PreprocessingHandle
ends the TRANSACTION
Note that I'm not ending the conversation, I don't want "fire-and-forget"
event notification on Queue_PreprocessOrder activates an instance of PreprocessOrder.exe (max concurrent of 1) which does the following:
begins a SqlTransaction
receives top 1 MESSAGE from Queue_PreprocessOrder
if message type is Message_PreprocessOrder (format XML):
sets the order state to "preprocessing" in Table_Orders using the order id in the message body
loads n collections of data from which it computes an n-ary Cartesian product (via LINQ; AFAIK this is not possible in T-SQL) to determine the order items collection
inserts the order items rows into a Table_OrderItems
sends a MESSAGE of type Message_PreprocessingDone, containing the same order id, using PreprocessingHandle
ends the conversation pertaining to PreprocessingHandle
commits the SqlTransaction
exits with Environment.Exit(0)
internal activation on Queue_HandleOrderState executes a SP (max concurrent of 1) that:
begins a TRANSACTION
receives top 1 MESSAGE from Queue_InitiatePreprocessOrder
if message type is Message_PreprocessingDone:
sets the order state to "processing" in Table_Orders using the order id in the message body
starts a DIALOG from Service_HandleOrderState to Service_ProcessOrderItem
stores the conversation handle (from now on ProcessOrderItemsHandle) in a specific column of Table_Orders
creates a cursor for rows in Table_OrderItems for current order id and for each row:
sends a MESSAGE of type Message_ProcessOrderItem, containing the order item id, using ProcessOrderItemsHandle
if message type is Message_ProcessingDone:
sets the order state to "processed" in Table_Orders using the order id in the message body
if message type is http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog (END DIALOG):
ends the conversation pertaining to conversation handle of the message
ends the TRANSACTION
event notification on Queue_ProcessOrderItem activates an instance of ProcessOrderItem.exe (max concurrent of 1) which does the following:
begins a SqlTransaction
receives top 1 MESSAGE from Queue_ProcessOrderItem
if message type is Message_ProcessOrderItem (format XML):
sets the order item state to "processing" in Table_OrdersItems using the order item id in the message body, then:
loads a collection of order item parameters
makes a HttpRequest to a URL using the parameters
stores the HttpResponse as a PDF on filesystem
if any errors occurred in above substeps, sets the order item state to "error", otherwise "ok"
performs a lookup in the Table_OrdersItems to determine if all order items are processed (state is "ok" or "error")
if all order items are processed:
sends a MESSAGE of type Message_ProcessingDone, containing the order id, using ProcessOrderItemsHandle
ends the conversation pertaining to ProcessOrderItemsHandle
commits the SqlTransaction
exits with Environment.Exit(0)
Notes:
specs specify MSSQL compatibility 2005 through 2012, so:
no CONVERSATION GROUPS
no CONVERSATION PRIORITY
no POISON_MESSAGE_HANDLING ( STATUS = OFF )
I am striving to achieve overall flow integrity and continuity, not speed
given that tables and SPs reside in DB1 whilst Service Broker objects (messages, contracts, queues, services) reside in DB2, DB2 is SET TRUSTWORTHY
Questions:
Are there any major design flaws in the described architecture ?
Order completion state tracking doesn't seem right. Is there a better method ? Maybe using QUEUE RETENTION ?
My intuition tells me that in no case whatsoever should the activated external exe terminate with an exit code other than 0, so there should be try{..}catch(Exception e){..} finally{ Environment.Exit(0) } in Main. Is this assumption correct ?
How would you organize error handling in DB code ? Is an error log table enough?
How would you organize error handling in external exe C# code ? Same error logging
table ?
I've seen the SQL Server Service Broker Product Samples, but the Service Broker Interface seems overkill for my seemingly simpler case. Any alternatives for a simpler Service Broker object model ?
Any cross-version "portable" admin tool for Service Broker capable of at least draining poison messages ?
Have you any decent code samples for any of the above ?
Q: Are there any major design flaws in the described architecture ?
A: A couple of minor quibbles:
- Waiting for an HTTP request to complete while holding open a transaction is bad. You can't achieve transactional consistency between a database and HTTP anyway, so don't risk having a transaction stretch for minutes when the HTTP call is slow. The typical pattern is to {begin tran / receive / begin conversation timer / commit}, then issue the HTTP call without any DB xact. If the HTTP call succeeds, then {begin xact / send response / end conversation / commit}. If the HTTP call fails (or the client crashes), let the conversation timer activate you again. You'll get a timer message (no body), and you need to pick up the item id associated with the handle from your table(s).
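The receive-then-timer pattern could be sketched roughly like this. Queue and message type names are taken from the question; the timeout value and variable names are illustrative assumptions.

```sql
-- Step 1: short transaction that receives the work item and arms a timer.
BEGIN TRANSACTION;

DECLARE @h UNIQUEIDENTIFIER, @msg_type SYSNAME, @body VARBINARY(MAX);

RECEIVE TOP (1)
    @h        = conversation_handle,
    @msg_type = message_type_name,
    @body     = message_body
FROM Queue_ProcessOrderItem;

-- If we crash during the HTTP call, this timer message re-activates us.
BEGIN CONVERSATION TIMER (@h) TIMEOUT = 120;

COMMIT;   -- the transaction is closed BEFORE the slow HTTP call

-- Step 2: make the HTTP call here, outside any DB transaction.

-- Step 3: on success, respond and end the dialog in a new short transaction.
BEGIN TRANSACTION;
SEND ON CONVERSATION @h MESSAGE TYPE [Message_ProcessingDone];
END CONVERSATION @h;   -- also cancels the pending timer
COMMIT;
```

If step 3 never runs, the timer fires, a bodyless timer message arrives on the queue, and the activated code looks up the order item id for @h in its own tables to retry.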
Q: Order completion state tracking doesn't seem right. Is there a better method ? Maybe using QUEUE RETENTION ?
A: My one critique of your state tracking is the dependency on scanning the order items to determine that the currently processed one is the last one (5.3.4). For example, you could add the information that this is the 'last' item to be processed to the item state, so that when processing it you know you need to report the completion. RETENTION is only useful in debugging, or when you have logic that requires running a 'logical rollback' and compensating actions on conversation error.
Q: My intuition tells me that in no case whatsoever should the activated external exe terminate with an exit code other than 0, so there should be try{..}catch(Exception e){..} finally{ Environment.Exit(0) } in Main. Is this assumption correct ?
A: The most important thing is for the activated process to issue a RECEIVE statement on the queue. If it fails to do so, the queue monitor may enter the notified state forever. Exit code is, if I remember correctly, irrelevant. As with any background process, it is important to catch and log exceptions; otherwise you'll never even know it has a problem when it starts failing. In addition to disciplined try/catch blocks, hook up Application.ThreadException for UI apps and AppDomain.UnhandledException for both UI and non-UI apps.
Q: How would you organize error handling in DB code ? Is an error log table enough?
A: I will follow up on this later. An error log table is sufficient, imho.
Q: How would you organize error handling in external exe C# code ? Same error logging table ?
A: I created bugcollect.com exactly because I had to handle such problems with my own apps. The problem is more than logging: you also want some aggregation and analysis (at least detecting duplicate reports) and to suppress floods of errors from some deployment config mishap in the field. Truth be told, nowadays there are more options, e.g. exceptron.com. And of course I think FogBugz also has logging capabilities.
Q: I've seen the SQL Server Service Broker Product Samples, but the Service Broker Interface seems overkill for my seemingly simpler case. Any alternatives for a simpler Service Broker object model ?
A: Finally, an easy question: yes, it is overkill. There is no simpler model.
Q: Any cross-version "portable" admin tool for Service Broker capable of at least draining poison messages ?
A: The problem with poison messages is that the definition of poison message changes with your code: the poison message is whatever message breaks the current guards set in place to detect it.
Q: Have you any decent code samples for any of the above ?
A: No
One more point: try to avoid any reference from DB1 to DB2 (e.g. 4.3.4 is activated in DB1 and reads the items table from DB2). This creates cross-DB dependencies, which break when a) one DB is offline (e.g. for maintenance) or overloaded, or b) you add database mirroring for HA/DR and one DB fails over. Try to make the code work even if DB1 and DB2 are on different machines (and without linked servers). If necessary, add more info to the message payloads. Architect it so that DB2 can be on a different machine, and even so that multiple DB2 machines can exist to scale out the HTTP/PDF-writing work.
And finally: this design will be very slow. I'm talking low tens of messages per second slow, with so many dialogs/messages involved and everything running with max_queue_readers 1. This may or may not be acceptable to you.

How can you find out what has a Service Broker conversation locked?

I have a Service Broker queue with one conversation that is not being processed by anything. It returns nothing when I run:
RECEIVE TOP(1000) * FROM dbo.QueueName
However, if I run this:
SELECT COUNT(*) FROM dbo.QueueName
I get a figure in the thousands, and the figure is not changing. I assume that some process has the conversation group locked but is not doing anything with it. How can I tell if this is the case, and how can I tell which SPID has the lock?
If you have access to system views, sys.dm_os_waiting_tasks and sys.dm_tran_locks should get you where you need to go. The former is high level "what's my process waiting on" type of stuff. From that, you can get which SPID(s) are blocking your query. From that list of SPIDs, you can (if you're interested) query the locks view to see what locks they're holding. Incidentally, I found that one of either allow_snapshot_isolation or read_committed_snapshot on the database helped with queue locking in a recent engagement.
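A sketch of that two-step investigation follows; the session id filter and the example blocking SPID are placeholders you would adjust.

```sql
-- Step 1: who is waiting, on what, and who is blocking them.
SELECT
    wt.session_id,
    wt.wait_type,
    wt.wait_duration_ms,
    wt.blocking_session_id,
    wt.resource_description
FROM sys.dm_os_waiting_tasks AS wt
WHERE wt.session_id > 50;   -- rough filter to skip most system sessions

-- Step 2: for a blocking SPID found above, list the locks it holds.
SELECT
    tl.request_session_id,
    tl.resource_type,
    tl.resource_description,
    tl.request_mode,
    tl.request_status
FROM sys.dm_tran_locks AS tl
WHERE tl.request_session_id = 123;   -- replace 123 with the blocking SPID
```

For a stuck conversation group you would expect to see the blocker holding locks whose resource_description points at the queue or conversation group resource.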

At what point will a series of selected SQL statements stop if I cancel the execution request in SQL Server Management Studio?

I am running a bunch of database migration scripts. I find myself with a rather pressing problem: business is waking up and expecting to see their data, and their data has not finished migrating. I also took the applications offline, and they really need to be started back up. In reality "the business" is a number of companies, and therefore I have a number of scripts running SPs in one query window like so:
EXEC [dbo].[MigrateCompanyById] 34
GO
EXEC [dbo].[MigrateCompanyById] 75
GO
EXEC [dbo].[MigrateCompanyById] 12
GO
EXEC [dbo].[MigrateCompanyById] 66
GO
Each SP calls a large number of other sub SPs to migrate all of the data required. I am considering cancelling the query, but I'm not sure at what point the execution will be cancelled. If it cancels nicely at the next GO then I'll be happy. If it cancels midway through one of the company migrations, then I'm screwed.
If I cannot cancel, could I ALTER the MigrateCompanyById SP and comment all the sub SP calls out? Would that also prevent the next one from running, whilst completing the one that is currently running?
Any thoughts?
One way to achieve a controlled cancellation is to add a table containing a cancel flag. You can set this flag when you want to cancel execution, and your SPs can check it at regular intervals and stop executing if appropriate.
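A minimal sketch of that idea — all object and column names here are made up for illustration:

```sql
-- One-row control table holding the cancel flag.
CREATE TABLE dbo.MigrationControl (CancelRequested BIT NOT NULL DEFAULT 0);
INSERT INTO dbo.MigrationControl (CancelRequested) VALUES (0);

-- Inside the long-running proc, check at safe points (e.g. between companies):
IF EXISTS (SELECT 1 FROM dbo.MigrationControl WHERE CancelRequested = 1)
BEGIN
    RAISERROR('Migration cancelled by operator.', 16, 1);
    RETURN;   -- assumes this runs inside a stored procedure
END

-- From another session, request a cancel:
UPDATE dbo.MigrationControl SET CancelRequested = 1;
```

The key point is that the proc only checks the flag at points where stopping leaves the data consistent, which is exactly the control you lose with a hard cancel.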
I was forced to cancel the script anyway.
When doing so, I noted that it cancels after the current executing statement, regardless of where it is in the SP execution chain.
Are you bracketing the code within each migration stored proc with transaction handling (BEGIN, COMMIT, etc.)? That would enable you to roll back the changes relatively easily depending on what you're doing within the procs.
One solution I've seen, you have a table with a single record having a bit value of 0 or 1, if that record is 0, your production application disallows access by the user population, enabling you to do whatever you need to and then set that flag to 1 after your task is complete to enable production to continue. This might not be practical given your environment, but can give you assurance that no users will be messing with your data through your app until you decide that it's ready to be messed with.
You can use this method to report the execution progress of your script.
The way you have it now, every sproc is its own transaction, so if you cancel the script you will get a partial update, up to the point of the last successfully executed sproc.
You can, however, put it all in a single transaction if you need an all-or-nothing update.
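For instance, a single all-or-nothing wrapper might look like this. The proc name comes from the question; the TRY/CATCH error handling is illustrative.

```sql
-- Run all company migrations as one atomic unit: either every company
-- migrates, or the whole batch rolls back.
BEGIN TRY
    BEGIN TRANSACTION;
    EXEC [dbo].[MigrateCompanyById] 34;
    EXEC [dbo].[MigrateCompanyById] 75;
    EXEC [dbo].[MigrateCompanyById] 12;
    EXEC [dbo].[MigrateCompanyById] 66;
    COMMIT;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK;
    THROW;   -- re-raise; on pre-2012 servers use RAISERROR instead
END CATCH;
```

Note the trade-off: a single large transaction holds locks for the whole run, which may be unacceptable during business hours.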