I have a long-running SP (it can run for up to several minutes) that basically performs a number of cleanup operations on various tables within a transaction. I'm trying to determine the best way to somehow pass human-readable status information back to the caller on what step of the process the SP is currently performing.
Because the entire SP runs inside a single transaction, I can't write this information back to a status table and then read it from another thread unless I use NOLOCK to read it, which I consider a last resort since:
NOLOCK can cause other data inconsistency issues; and
this places the onus on anyone wanting to read the status table that they need to use NOLOCK because the table or row(s) could be locked for quite a while.
Is there any way to issue a single command (or EXEC a second SP) within a transaction and specify that that particular command shouldn't be part of the transaction? Or is there some other way for ADO.NET to gain insight into this long-running SP to see what it is currently doing?
You can PRINT messages in T-SQL and get them delivered to your SqlConnection in ADO.NET via the "InfoMessage" event. See
http://msdn.microsoft.com/en-us/library/a0hee08w.aspx
for details.
You could try using RAISERROR (use a severity of 10 or lower) within the procedure to return informational messages.
Example:
RAISERROR(N'Step 5 completed.', 10, 1) WITH NOWAIT;
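For instance, here is a minimal sketch of how that might look inside a long-running cleanup proc (the proc name, step names, and tables are placeholders, not from the question). With severity 10 and WITH NOWAIT the messages are flushed to the client as each step starts, and ADO.NET raises them through the SqlConnection.InfoMessage event mentioned above, even while the work is still inside one transaction:

CREATE PROCEDURE dbo.usp_NightlyCleanup   -- hypothetical name
AS
BEGIN
    SET NOCOUNT ON;

    BEGIN TRANSACTION;

    RAISERROR(N'Step 1 of 3: purging expired sessions...', 10, 1) WITH NOWAIT;
    DELETE FROM dbo.Sessions WHERE ExpiresOn < GETUTCDATE();   -- placeholder cleanup step

    RAISERROR(N'Step 2 of 3: archiving old orders...', 10, 1) WITH NOWAIT;
    -- ... more cleanup work ...

    RAISERROR(N'Step 3 of 3: rebuilding summary tables...', 10, 1) WITH NOWAIT;
    -- ... more cleanup work ...

    COMMIT TRANSACTION;
END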
Related
I have a stored procedure which cannot be executed concurrently. Multiple processes call this stored procedure, but it is of vital importance that the processes access the stored procedure sequentially.
The stored procedure basically scans a table for a primary key that meets various conditions, marks the record as in-use by the calling process, and then passes the primary key back to the calling process.
Anywhere from one to a dozen instances of the calling process could exist, depending upon how much work exists.
I decided to prevent concurrency by using sp_GetAppLock inside the stored procedure. I grab an exclusive transaction lock, with @Resource set to a string that is only used inside this stored procedure. The only thing that is ever blocked by this lock is the execution of this stored procedure.
The call inside the stored procedure looks like this:
EXEC sp_getapplock @Resource = 'My Unique String Here'
                 , @LockMode = 'Exclusive'     -- type of lock
                 , @LockOwner = 'Transaction'  -- Transaction or Session
                 , @LockTimeout = 5000;
It works swimmingly. If a dozen instances of my process are running, only one of them executes the stored procedure at any one point in time, while the other 11 obediently queue up and wait their turn.
The only problem is our DBA. He is a very good DBA who constantly monitors the database for blocking and receives an alert when it exceeds a certain threshold. My use of sp_getapplock triggers a lot of alerts. My DBA claims that the blocking in-and-of-itself is a performance problem.
Is his claim accurate? My feeling is that this is "good" blocking, in that the only thing being blocked is execution of the stored procedure, and I want that to be blocked. But my DBA says that forcing SQL Server to enforce this blocking is a significant drain on resources.
Can I tell him to "put down the crack pipe," as we used to say? Or should I re-write my application to avoid the need for sp_getapplock?
The article I read which sold me on sp_getapplock is here: sp_getapplock
Unfortunately, I think your DBA has a point: blocking does drain resources, and this type of blocking is putting extra load on the server.
Let me explain how:
The proc gets called; SQL Server assigns it a worker thread from the thread pool and it starts executing.
Calls 2, 3, 4, ... come in; SQL Server again assigns worker threads to these calls. The threads start executing, but because of the exclusive lock you have taken, they are all suspended, sitting on the waiting list until the resource becomes available.
Worker threads, which are limited in number on any SQL Server instance, are being held up by your process.
Now SQL Server is accumulating waits because of something a developer decided to do.
As DBAs, we want you to come to SQL Server, get what you need, and leave as soon as possible. If you are intentionally staying there, holding on to resources, and putting SQL Server under pressure, it will piss off the DBA.
I think you need to reconsider your application design and come up with an alternative solution.
Maybe a "Process Table" in the SQL Server, update it with some value when a process start and for each call check the process table first before you fire the next call for that proc. So the wait stuff happens in the application layer and only when the resources are available then go to DB.
"The stored procedure basically scans a table for a primary key that meets various conditions, marks the record as in-use by the calling process, and then passes the primary key back to the calling process."
Here is a different way to do it inside the SP:
BEGIN TRANSACTION;

SELECT x.PKCol
FROM dbo.[myTable] x WITH (XLOCK, ROWLOCK, READPAST)   -- lock the claimed row, skip rows other callers have locked
WHERE x.col1 = @col1 ...;

IF @@ROWCOUNT > 0
BEGIN
    UPDATE dbo.[myTable]
    SET ...                                             -- mark the row as in-use
    WHERE col1 = @col1;
END;

COMMIT TRANSACTION;
XLOCK
Specifies that exclusive locks are to be taken and held until the transaction completes. If specified with ROWLOCK, PAGLOCK, or TABLOCK, the exclusive locks apply to the appropriate level of granularity.
I'm playing with the idea of rerouting every end-user stored procedure call of my database through a logging stored procedure. Essentially it would wrap the stored procedure call in some simple logging logic: who made the call, how long it took, etc.
Can this potentially create a bottleneck? I'm concerned that when the amount of total stored procedure calls grows this could become a serious problem.
Routing everything through a single point of entry is not optimal. Even if there are no performance issues, it is still something of a maintenance problem as you will need to expose the full range of Input Parameters that the real procs are accepting in the controller proc. Adding procs to this controller over time will require a bit of testing each time to make sure that you mapped the parameters correctly. Removing procs over time might leave unused input parameters. This method also requires that the app code pass in the params it needs to for the intended proc, but also the name (or ID?) of the intended proc, and this is another potential source of bugs, even if a minor one.
It would be better to have a general logging proc that gets called as the first thing of each of those procs. That is a standard template that can be added to any new proc quite easily. This leaves a clean API to the app code such that the app code is likewise maintainable.
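For illustration, a rough template along those lines; the table, proc, and column names below are invented:

-- Hypothetical audit table.
CREATE TABLE dbo.ProcCallLog
(
    LogId    bigint IDENTITY(1,1) PRIMARY KEY,
    ProcName sysname      NOT NULL,
    CalledBy sysname      NOT NULL,
    CalledAt datetime2(3) NOT NULL DEFAULT (SYSDATETIME())
);
GO

-- Shared logging proc, called as the first statement of every real proc.
CREATE PROCEDURE dbo.usp_LogCall
    @ProcName sysname
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.ProcCallLog (ProcName, CalledBy)
    VALUES (@ProcName, ORIGINAL_LOGIN());
END
GO

-- Template for each real proc.
CREATE PROCEDURE dbo.usp_DoRealWork   -- hypothetical
AS
BEGIN
    SET NOCOUNT ON;
    EXEC dbo.usp_LogCall @ProcName = N'dbo.usp_DoRealWork';
    -- ... the real work ...
END
GO

Because the log table only takes narrow, append-only inserts, the logging call typically adds little contention to the real procs.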
SQL Server can run the same stored procedure concurrently, as long as it doesn't cause blocking or deadlocks on the resources it is using. For example:
CREATE PROCEDURE ##test
AS
BEGIN
SELECT 1
WAITFOR DELAY '00:00:10'
SELECT 2
END
Now execute this stored procedure quickly in two different query windows to see it running at the same time:
--Query window 1
EXEC ##test
--Query window 2
EXEC ##test
So you can see there won't be a queue of calls waiting to execute the stored procedure. The only problem you may run into is if you are logging the proc details to a table: depending on the isolation level, you could get blocking as the logging proc locks pages in that table to record the data. I don't believe this would be a problem unless you are running the logging stored procedure extremely heavily, but you'd want to run some tests to be sure.
I have a stored procedure, and I want to ensure it cannot be executed concurrently.
My (multi-threaded) application does all necessary work on the underlying table via this stored procedure.
IMO, locking the table itself is an unnecessarily drastic action to take, and so when I found out about sp_GetAppLock, which essentially enforces a critical section, this sounded ideal.
My plan was to encase the stored procedure in a transaction and to set up sp_GetAppLock with transaction scope. The code was written and tested successfully.
The code has now been put forward for review and I have been told that I should not call this function. However when asking the obvious question "why not?", the only reasons I am getting are highly subjective, to do with any form of locking being complicated.
I don't necessarily buy this, but I was wondering whether anyone had any objective reasons why I should avoid this construct. Like I say, given my circumstances a critical section sounds an ideal approach to me.
Further info: An application sits on top of this with 2 threads T1 and T2. Each thread is waiting for a different message M1 and M2. The business logic involved says that processing can only happen once both M1 and M2 have arrived. The stored procedure logs that Mx has arrived (insert) and then checks whether My is present (select). The built-in locking is fine to make sure the inserts happen serially. But the selects need to happen serially too and I think I need to do something over and above the built-in functionality here.
Just for clarity, I want the "processing" to happen exactly once. So I can't afford for the stored procedure to return either false positives or false negatives. I'm worried that if the stored proc runs twice in very quick succession, then both "selects" might return data which indicates that it is appropriate to perform processing.
What is the procedure doing that you cannot rely on SQL Server's built-in concurrency control mechanisms? Often queries can be rewritten to allow real concurrency.
But if this procedure indeed has to be executed "alone", locking the table itself on first access is most likely going to be a lot faster than using the call to sp_GetAppLock. It sounds like this procedure is going to be called often. If that is the case you should look for a way to achieve the goal with minimal impact.
If the table contains no other rows besides M1 and M2, a table lock is still your best bet.
If you have multiple threads sending multiple messages, you can get more fine-grained by using SERIALIZABLE as the transaction isolation level and checking whether the other message is already there before you do the insert, all within the same transaction. To prevent deadlocks in this case, make sure you check for both messages, for example like this:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;

DECLARE @hasM1 bit, @hasM2 bit;

SELECT
    @hasM1 = MAX(CASE WHEN msg_type = 'M1' THEN 1 ELSE 0 END),
    @hasM2 = MAX(CASE WHEN msg_type = 'M2' THEN 1 ELSE 0 END)
FROM messages WITH (UPDLOCK)
WHERE msg_type IN ('M1', 'M2');

INSERT ...

IF (??) EXEC do_other_stuff_and_delete_messages;

COMMIT;
In the IF statement before(!) the COMMIT you can use the information collected before the insert together with the information that you inserted to decide if additional processing is necessary.
In that processing step make sure to either mark those messages as processed or to delete them all still within the same transaction. That will make sure that you will not process those messages twice.
SERIALIZABLE is the only transaction isolation level that lets you lock rows that do not exist yet (it takes key-range locks), so the first SELECT with WITH (UPDLOCK) effectively prevents the other row from being inserted while the first execution is still running.
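To make the shape of that concrete, here is a fuller sketch; the table dbo.messages, the proc names, and the @msg_type parameter are all invented here, and the real pair-completion test would depend on your schema:

CREATE PROCEDURE dbo.usp_MessageArrived   -- hypothetical name
    @msg_type char(2)                     -- 'M1' or 'M2'
AS
BEGIN
    SET NOCOUNT ON;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

    BEGIN TRAN;

    DECLARE @hasM1 bit, @hasM2 bit;

    -- Range-lock the M1/M2 key range so a concurrent call cannot insert
    -- the other message until this transaction finishes.
    SELECT
        @hasM1 = MAX(CASE WHEN msg_type = 'M1' THEN 1 ELSE 0 END),
        @hasM2 = MAX(CASE WHEN msg_type = 'M2' THEN 1 ELSE 0 END)
    FROM dbo.messages WITH (UPDLOCK)
    WHERE msg_type IN ('M1', 'M2');

    INSERT INTO dbo.messages (msg_type) VALUES (@msg_type);

    -- Only the call that completes the pair does the processing.
    IF (@msg_type = 'M1' AND @hasM2 = 1) OR (@msg_type = 'M2' AND @hasM1 = 1)
        EXEC dbo.do_other_stuff_and_delete_messages;   -- deletes both rows inside this transaction

    COMMIT;
END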
Finally, these are a lot of things to be aware of that could go wrong. You might want to have a look at Service Broker instead. You could use three queues: one for type M1 and one for type M2. Every time a message arrives in one of those queues, a procedure can automatically be called to insert a token into the third queue. The third queue could then activate a process that checks whether both messages exist and does the work. That would make the entire process asynchronous, but in exchange it is easy to restrict the queue 3 reader to only ever run one check at a time.
See Service Broker on MSDN; also look at "activation" for the automatic message processing.
sp_GetAppLock is just like many other tools and as such it can be misused, overused, or correctly used. It is an exact match for the type of problem described by the original poster.
This is a good MSSQL Tips post on the usage
Prevent multiple users from running the same SQL Server stored procedure at the same time
http://www.mssqltips.com/sqlservertip/3202/prevent-multiple-users-from-running-the-same-sql-server-stored-procedure-at-the-same-time/
We use sp_getapplock all the time, due to the fact that we support some legacy applications that have been re-worked to use a SQL back-end, and the SQL Server locking model is not an exact match for our application logic.
We tend to go for a 'pessimistic' locking model, where we lock an entity before allowing a user to edit it, and we use the (NOLOCK) hint extensively when reading data to bypass any blocking from the native locks on the actual tables. sp_getapplock is a good match for this. We also use it to enforce critical paths in large multi-user systems. You have to be systematic about how you name the locks you place.
We've found no performance problems with large numbers of users/locks via this route, so I see no reason why it wouldn't work well for you. Just be aware that you can get blocking and deadlocks if you have processes that take the same named locks, but not necessarily in the same order.
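For reference, a minimal critical-section template using sp_getapplock; the proc and resource names are placeholders, and the negative return-code check follows the documented sp_getapplock return values (0 or 1 = granted, negative = timeout/deadlock/error):

CREATE PROCEDURE dbo.usp_SerializedWork   -- hypothetical name
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;

    BEGIN TRAN;

    DECLARE @rc int;

    -- Only one transaction at a time may hold this named lock.
    EXEC @rc = sp_getapplock
        @Resource    = N'usp_SerializedWork',
        @LockMode    = N'Exclusive',
        @LockOwner   = N'Transaction',
        @LockTimeout = 5000;          -- milliseconds; -1 waits indefinitely

    IF @rc < 0
    BEGIN
        ROLLBACK;
        RAISERROR(N'Could not acquire application lock.', 16, 1);
        RETURN;
    END

    -- ... the work that must not run concurrently ...

    COMMIT;   -- a transaction-owned lock is released automatically here
END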
You can create a table with a flag for each set of messages, so whichever thread starts processing first marks the flag as in-process.
To make sure the record is blocked properly once one of the threads reaches it, use:
SELECT ... FROM <table> WITH (XLOCK, ROWLOCK, READCOMMITTED) ... WHERE ...
This piece of code puts an exclusive lock on the record, meaning whoever gets to it first owns the row.
Then you make your changes and update the flag; the other thread will see the updated value because it is blocked by the exclusive lock until the first thread commits or rolls back its transaction.
For this to work, you always need to select records from the table with XLOCK; that way it will behave as expected.
Hope this helps.
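A rough sketch of that flag-table idea; the table, column, and status values are invented here:

-- Hypothetical flag table: one row per message pair.
-- CREATE TABLE dbo.MessageSet (SetId int PRIMARY KEY, Status varchar(20) NOT NULL);

DECLARE @SetId int = 42;       -- the message pair this thread is handling
DECLARE @Status varchar(20);

BEGIN TRAN;

-- Exclusively lock the row; a second thread running this same statement
-- blocks here until the first transaction commits or rolls back.
SELECT @Status = Status
FROM dbo.MessageSet WITH (XLOCK, ROWLOCK, READCOMMITTED)
WHERE SetId = @SetId;

IF @Status = 'Ready'           -- only the first thread in sees 'Ready'
BEGIN
    UPDATE dbo.MessageSet
    SET Status = 'Processing'
    WHERE SetId = @SetId;

    -- ... do the work exactly once ...
END

COMMIT;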
To prove the exclusive lock:
USE master
GO
IF OBJECT_ID('dbo.tblTest') IS NOT NULL
DROP TABLE dbo.tblTest
CREATE TABLE tblTest ( id int PRIMARY KEY )
;WITH cteNumbers AS (
SELECT 1 N
UNION ALL
SELECT N + 1 FROM cteNumbers WHERE N<1000
)
INSERT INTO tblTest (id)
SELECT N
FROM cteNumbers
OPTION (MAXRECURSION 0)
BEGIN TRANSACTION
SELECT * FROM dbo.tblTest WITH(XLOCK,ROWLOCK,READCOMMITTED) WHERE id = 1
SELECT * FROM sys.dm_tran_locks WHERE resource_database_id = DB_ID('master')
ROLLBACK TRANSACTION
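While that transaction is still open, a second session shows the effect (assuming the default READ COMMITTED isolation without row versioning):

-- Session 2, while session 1 still holds the XLOCK on id = 1:

-- This blocks until session 1 commits or rolls back.
SELECT * FROM dbo.tblTest WHERE id = 1;

-- With READPAST the exclusively locked row is simply skipped,
-- so only rows 2 through 5 come back.
SELECT * FROM dbo.tblTest WITH (ROWLOCK, READPAST) WHERE id <= 5;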
"Transaction (Process ID 63) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly."
Could this deadlock be caused by something the stored proc uses, like SQL Mail? Or is it always caused by something like two applications accessing the same table at the same time?
Two processes accessing the same table at the same time happens all the time in an application. Generally that won't cause a deadlock. A deadlock typically happens when you have, say, process 'A' attempting to update Table 1, then Table 2, and then Table 3, while process 'B' attempts to update Table 3, then Table 2, and then Table 1. Process 'A' holds a resource locked that process 'B' needs, and process 'B' holds a resource that process 'A' needs. SQL Server detects this as a deadlock and rolls one of the processes back as a failed transaction.
The bottom line is that you have two processes attempting to update the same tables at the same time, but not in the same order. This will often lead to deadlocks.
One easy way to handle this in your application is to handle the failed transaction and simply re-execute the transaction. It will almost always execute successfully. A better solution is to make sure your processes are updating tables in the same order, as much as possible.
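A sketch of the retry approach on the T-SQL side (error 1205 is the deadlock-victim error; THROW needs SQL Server 2012 or later, and the body of the transaction is a placeholder):

DECLARE @retry int = 3;

WHILE @retry > 0
BEGIN
    BEGIN TRY
        BEGIN TRAN;

        -- ... the statements that occasionally deadlock ...

        COMMIT;
        BREAK;                         -- success: leave the retry loop
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK;

        SET @retry = @retry - 1;

        -- 1205 = chosen as deadlock victim; anything else, or running out of
        -- retries, is re-raised to the caller.
        IF ERROR_NUMBER() <> 1205 OR @retry = 0
            THROW;
    END CATCH
END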
Missing Indexes is another common cause of deadlocks. If a select query can get the info it needs from an index instead of the base table, then it won't be blocked by any updates/inserts on the table itself.
To find out for sure, use the SQL profiler to trace for "Deadlock Graph" events, which will show you the detail of the deadlock itself.
Based on this, I don't think SQL Mail itself would directly be the culprit. I say "directly" because I don't know what you're doing with it. However, I assume SQL Mail is probably slow compared to the rest of your SQL ops, so if you're doing a lot with that, it could indirectly create a bottleneck that leads to a deadlock if you're holding onto tables while sending off the SQL Mail.
It's hard to recommend a specific strategy without knowing more about what you're doing. The short of it is that you should consider whether there's a way to break your dependence on holding onto the tables while you're doing this, such as using NOLOCK, using a temp table or non-temp "holding" table, or just refactoring the SP that is making the call.
I am running a bunch of database migration scripts. I find myself with a rather pressing problem, that business is waking up and expected to see their data, and their data has not finished migrating. I also took the applications offline and they really need to be started back up. In reality "the business" is a number of companies, and therefore I have a number of scripts running SPs in one query window like so:
EXEC [dbo].[MigrateCompanyById] 34
GO
EXEC [dbo].[MigrateCompanyById] 75
GO
EXEC [dbo].[MigrateCompanyById] 12
GO
EXEC [dbo].[MigrateCompanyById] 66
GO
Each SP calls a large number of other sub SPs to migrate all of the data required. I am considering cancelling the query, but I'm not sure at what point the execution will be cancelled. If it cancels nicely at the next GO then I'll be happy. If it cancels mid way through one of the company migrations, then I'm screwed.
If I cannot cancel, could I ALTER the MigrateCompanyById SP and comment all the sub SP calls out? Would that also prevent the next one from running, whilst completing the one that is currently running?
Any thoughts?
One way to achieve a controlled cancellation is to add a table containing a cancel flag. You can set this flag when you want to cancel execution, and your SPs can check it at regular intervals and stop executing if appropriate.
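A small sketch of that idea; the table and column names are invented:

-- Hypothetical control table with a single row.
-- CREATE TABLE dbo.MigrationControl (CancelRequested bit NOT NULL);
-- INSERT INTO dbo.MigrationControl (CancelRequested) VALUES (0);

-- Inside MigrateCompanyById, check the flag between sub-steps:
IF EXISTS (SELECT 1 FROM dbo.MigrationControl WHERE CancelRequested = 1)
BEGIN
    RAISERROR(N'Cancellation requested - stopping after the current step.', 16, 1);
    RETURN;
END

-- From another session, request a controlled stop:
-- UPDATE dbo.MigrationControl SET CancelRequested = 1;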
I was forced to cancel the script anyway.
When doing so, I noted that it cancels after the currently executing statement, regardless of where it is in the SP execution chain.
Are you bracketing the code within each migration stored proc with transaction handling (BEGIN, COMMIT, etc.)? That would enable you to roll back the changes relatively easily depending on what you're doing within the procs.
One solution I've seen: you have a table with a single record holding a bit value of 0 or 1. If that record is 0, your production application disallows access by the user population, enabling you to do whatever you need to; you then set the flag to 1 after your task is complete so production can continue. This might not be practical in your environment, but it gives you assurance that no users will be messing with your data through your app until you decide it's ready to be messed with.
You can use this method to report the execution progress of your script.
The way you have it now, every sproc is its own transaction, so if you cancel the script the data will only be updated partially, up to the point of the last successfully executed sproc.
You can, however, put it all in a single transaction if you need an all-or-nothing update.
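A minimal sketch of the all-or-nothing variant, reusing the calls from the question (this assumes MigrateCompanyById does not manage transactions in a way that conflicts with an outer one):

SET XACT_ABORT ON;   -- any run-time error rolls the whole batch back

BEGIN TRANSACTION;

EXEC [dbo].[MigrateCompanyById] 34;
EXEC [dbo].[MigrateCompanyById] 75;
EXEC [dbo].[MigrateCompanyById] 12;
EXEC [dbo].[MigrateCompanyById] 66;

COMMIT TRANSACTION;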