Transaction causing stored procedure to hang - sql-server

We have a Windows service that orchestrates imports to a database. The service itself isn't the problem, as it essentially just creates a scheduled job, waits until it completes, and then fires a series of stored procs. The problem is that one of the procs appears to be getting stuck midway through. It's not throwing an error, so I have nothing that I can point to as a definitive problem. I have narrowed it down to a single proc that gets called after the job has completed. I've even managed to narrow it down to a specific line of code, but that's where I'm struggling.
The proc defines a transaction name at the start, being the name of the proc plus a datetime. It also captures the transaction count from @@TRANCOUNT. It then defines a cursor that loops over the files associated with the event. Inside a try block it dynamically creates a view (which definitely happens, as I write a log entry afterwards). Immediately after this, there is an IF condition that either begins or saves the transaction, based on whether the variable holding @@TRANCOUNT is zero or not. Inside this condition I write a message to our log table BEFORE the transaction is begun/saved.
Immediately after (regardless of whether it's a begin or a save), I write another log message. That log entry is there. In the cases where we've seen this pausing, the proc always writes the "begin transaction" log message; it never gets as far as writing the message outside the condition. The only thing that happens between the first message (pre begin/save tran) and the second message (post tran) is the BEGIN/SAVE TRANSACTION itself. As the message being logged is the "begin" message, there can't already be a transaction open (@@TRANCOUNT must have been zero). However, as no error is raised, I can't say with 100% certainty that this is the case. The line it seems to stop on is the BEGIN TRANSACTION @TransactionName line. This seems to imply that something is locking and preventing the statement from being executed. The problem is we can see no open transactions (DBCC OPENTRAN reports nothing open); the proc just hangs there.
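For reference, the pattern being described is the common nest-safe transaction template. A minimal sketch, with hypothetical names since the actual code can't be posted:

DECLARE @TranName varchar(32) =
    'MyImportProc_' + REPLACE(REPLACE(REPLACE(CONVERT(varchar(19), GETDATE(), 120), '-', ''), ':', ''), ' ', '');
DECLARE @TranCount int = @@TRANCOUNT;  -- captured at the start of the proc

-- log "pre begin/save tran" message here
IF @TranCount = 0
    BEGIN TRANSACTION @TranName;   -- the statement that appears to hang
ELSE
    SAVE TRANSACTION @TranName;    -- nested call: set a savepoint instead
-- log "post tran" message here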
We're fairly certain it's a lock of some description, but we're completely baffled as to what. To add a level of complexity, it doesn't occur every time. Sometimes, with the same file, we can run the process against this database without any issue. We've tried running the file against another database without managing to replicate the problem, but we have seen it occur on other databases on this server (the server holds multiple client databases that do the same thing). It also only happens on this server; we have other servers in the environment, with seemingly identical configs, where we haven't seen this issue surface.
Unfortunately we can’t post any of the code due to internal rules, but any ideas would be appreciated.

Try using sp_WhoIsActive with its locks option enabled. I also recommend finding the query plan with the code below and analyzing the stats there.
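For the locks part, assuming Adam Machanic's sp_WhoIsActive is installed, that would be something like:

EXEC dbo.sp_WhoIsActive @get_locks = 1;  -- includes per-session lock detail alongside blocking info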
SELECT * FROM
(
    SELECT DB_NAME(p.dbid) AS DBName,
           OBJECT_NAME(p.objectid, p.dbid) AS ObjectName,
           cp.usecounts,
           p.query_plan,
           q.text,
           p.dbid,
           p.objectid,
           p.number,
           p.encrypted,
           cp.plan_handle
    FROM sys.dm_exec_cached_plans cp
    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) p
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS q
    WHERE cp.cacheobjtype = 'Compiled Plan'
) a
WHERE text LIKE '%SNIPPET OF SQL GOES HERE THAT IS PART OF THE QUERY YOU WANT TO FIND%'
ORDER BY dbid, objectid;

Have you thought about checking the connection/command timeout properties for the service? If the timeout is set too low and the proc takes longer to run, the connection will be dropped and the process killed off.
This is much more likely than it being anything to do with the name of the transaction.

Related

SQL Error during Lazy Loop awaiting Azure DB resize

I want to automate some DB scaling in my Azure SQL database.
This can be easily initiated using this:
ALTER DATABASE [myDatabase]
MODIFY (EDITION ='Standard', SERVICE_OBJECTIVE = 'S3', MAXSIZE = 250 GB);
But that command returns instantly, whilst the resize takes a few tens of seconds to complete.
We can check the actual current size using the following, which doesn't update until the change is complete:
SELECT DATABASEPROPERTYEX('myDatabase', 'ServiceObjective')
So naturally I wanted to combine this with a WHILE loop and a WAITFOR DELAY, in order to create a stored procedure that will change the DB size, and not return until the change has completed.
But when I wrote that stored procedure (script below) and ran it, I get the following error every time (at about the same time that the size change completes):
A severe error occurred on the current command. The results, if any, should be discarded.
The resize succeeds, but I get errors instead of a cleanly finishing stored procedure call.
Various things I've already tested:
- If I separate the "Initiate" and the "WaitLoop" sections, and start the WaitLoop in a separate connection after initiation but before completion, that also gives the same error.
- Adding a TRY...CATCH block doesn't help either.
- Removing the stored procedure aspect and just running the code directly doesn't fix it either.
My interpretation is that the Resize isn't quite as transparent as one might hope, and that connections created before the resize completes get corrupted in some sense.
Whatever the exact cause, it seems to me that this stored procedure just isn't achievable at all; I'll have to do the polling from my external process - opening new connections each time. It's not an awful solution, but it is less pleasant than being able to encapsulate the whole thing in a single stored procedure. Ah well, such is life.
Question:
Before I give up on this entirely ... does anyone have an alternative explanation or solution for this error, which would allow a single stored procedure call to change the size and then not return until that size change has actually completed?
Initial stored procedure code (simplified to remove parameterisation complexity):
CREATE PROCEDURE [trusted].[sp_ResizeAzureDbToS3AndWaitForCompletion]
AS
ALTER DATABASE [myDatabase]
MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S3', MAXSIZE = 250 GB);

WHILE ((SELECT DATABASEPROPERTYEX('myDatabase', 'ServiceObjective')) != 'S3')
BEGIN
    WAITFOR DELAY '00:00:05'
END

RETURN 0
"Whatever the exact cause, it seems to me that this stored procedure just isn't achievable at all; I'll have to do the polling from my external process - opening new connections each time."
Yes, this is correct. As described here, when you change the service objective of a database:
"A new compute instance is created with the requested service tier and compute size... the database remains online during this step, and connections continue to be directed to the database in the original compute instance ... [then] existing connections to the database in the original compute instance are dropped. Any new connections are established to the database in the new compute instance."
The key sentence ("existing connections ... are dropped") is what kills your stored procedure execution. You need to do this check externally.
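As a sketch of what that external poller can run on each fresh connection: either the DATABASEPROPERTYEX check from the question, or (assuming Azure SQL Database, where this DMV is exposed in the logical server's master database) the operation status view:

-- Run in master on the logical server; shows ALTER DATABASE progress
SELECT operation, state_desc, percent_complete, start_time, last_modify_time
FROM sys.dm_operation_status
WHERE major_resource_id = 'myDatabase'
ORDER BY start_time DESC;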

Conditional SQL block evaluated even when it won't be executed

I'm working on writing a migration script for a database, and am hoping to make it idempotent, so we can safely run it any number of times without fear of it altering the database (/ migrating data) beyond the first attempt.
Part of this migration involves removing columns from a table, but inserting that data into another table first. To do so, I have something along these lines.
IF EXISTS
    (SELECT * FROM sys.columns
     WHERE object_id = OBJECT_ID('TableToBeModified')
       AND name = 'ColumnToBeDropped')
BEGIN
    CREATE TABLE MigrationTable (
        Id int,
        ColumnToBeDropped varchar
    );

    INSERT INTO MigrationTable (Id, ColumnToBeDropped)
    SELECT Id, ColumnToBeDropped
    FROM TableToBeModified;
END
The first time through, this works fine, since the column still exists. However, on subsequent attempts, it fails because the column no longer exists. I understand that the entire script is evaluated up front, and that I could instead put the inner contents into an EXEC statement. But is that really the best solution to this problem, or is there another, still potentially "validity enforced", option?
I understand that the entire script is evaluated, and I could instead put the inner contents into an EXEC statement, but is that really the best solution to this problem
Yes. There are several scenarios in which you would want to push off the parsing/validation due to dependencies elsewhere in the script. I will sometimes put things into an EXEC even when there are no current problems, to ensure there won't be any as the rest of the script changes, or as the environment changes due to additional modifications made after the current rollout script was developed. As a minor bonus, it helps break things up visually.
While there can be permission issues related to breaking ownership chaining when using dynamic SQL, that is rarely a concern for a rollout script, and not a problem I have ever run into.
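For concreteness, a minimal sketch of the EXEC approach using the objects from the question (the inner batch is only parsed and compiled when the IF branch actually runs):

IF EXISTS
    (SELECT * FROM sys.columns
     WHERE object_id = OBJECT_ID('TableToBeModified')
       AND name = 'ColumnToBeDropped')
BEGIN
    EXEC('
        CREATE TABLE MigrationTable (
            Id int,
            ColumnToBeDropped varchar -- bare varchar defaults to varchar(1); give it a length in real code
        );
        INSERT INTO MigrationTable (Id, ColumnToBeDropped)
        SELECT Id, ColumnToBeDropped
        FROM TableToBeModified;
    ');
END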
This is useful when we are not sure whether the script will work, especially when migrating a database.
For a query that changes data, I will execute the script inside a BEGIN TRAN, check that the result is as expected, and then COMMIT TRAN if it is; otherwise I ROLLBACK the transaction, which discards the changes.
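In other words, something like this (a sketch with hypothetical table and column names):

BEGIN TRAN;

UPDATE dbo.Customers          -- hypothetical data change under test
SET Status = 'Migrated'
WHERE Status = 'Pending';

SELECT @@ROWCOUNT;            -- inspect the affected rows / query the data to verify

-- COMMIT TRAN;    -- run this if the result is as expected
-- ROLLBACK TRAN;  -- otherwise run this to discard the change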

Should I avoid using sp_getAppLock?

I have a stored procedure, and I want to ensure it cannot be executed concurrently.
My (multi-threaded) application does all necessary work on the underlying table via this stored procedure.
IMO, locking the table itself is an unnecessarily drastic action to take, and so when I found out about sp_GetAppLock, which essentially enforces a critical section, this sounded ideal.
My plan was to encase the stored procedure in a transaction and to set up sp_getapplock with transaction scope. The code was written and tested successfully.
The code has now been put forward for review and I have been told that I should not call this function. However when asking the obvious question "why not?", the only reasons I am getting are highly subjective, to do with any form of locking being complicated.
I don't necessarily buy this, but I was wondering whether anyone had any objective reasons why I should avoid this construct. Like I say, given my circumstances a critical section sounds an ideal approach to me.
Further info: An application sits on top of this with 2 threads T1 and T2. Each thread is waiting for a different message M1 and M2. The business logic involved says that processing can only happen once both M1 and M2 have arrived. The stored procedure logs that Mx has arrived (insert) and then checks whether My is present (select). The built-in locking is fine to make sure the inserts happen serially. But the selects need to happen serially too and I think I need to do something over and above the built-in functionality here.
Just for clarity, I want the "processing" to happen exactly once. So I can't afford for the stored procedure to return either false positives or false negatives. I'm worried that if the stored proc runs twice in very quick succession, then both "selects" might return data which indicates that it is appropriate to perform processing.
What is the procedure doing that you cannot rely on SQL Server's built-in concurrency control mechanisms? Often queries can be rewritten to allow real concurrency.
But if this procedure indeed has to be executed "alone", locking the table itself on first access is most likely going to be a lot faster than using the call to sp_GetAppLock. It sounds like this procedure is going to be called often. If that is the case you should look for a way to achieve the goal with minimal impact.
If the table contains no rows other than M1 and M2, a table lock is still your best bet.
If you have multiple threads sending multiple messages, you can get more fine-grained by using SERIALIZABLE as the transaction isolation level and checking whether the other message is there before you do the insert, but within the same transaction. To prevent deadlocks in this case, make sure you check for both messages, for example like this:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;

DECLARE @hasM1 int, @hasM2 int;

SELECT
    @hasM1 = MAX(CASE WHEN msg_type = 'M1' THEN 1 ELSE 0 END),
    @hasM2 = MAX(CASE WHEN msg_type = 'M2' THEN 1 ELSE 0 END)
FROM messages WITH (UPDLOCK)
WHERE msg_type IN ('M1', 'M2');

INSERT ...

IF (??) EXEC do_other_stuff_and_delete_messages;

COMMIT
In the IF statement before(!) the COMMIT you can use the information collected before the insert, together with the information that you inserted, to decide whether additional processing is necessary.
In that processing step, make sure to either mark those messages as processed or delete them, all still within the same transaction. That will ensure that you do not process those messages twice.
SERIALIZABLE is the only transaction isolation level that allows locking rows that do not exist yet, so the first SELECT with WITH (UPDLOCK) effectively prevents the other row from being inserted while the first execution is still running.
Finally, these are a lot of things to be aware of that could go wrong. You might want to have a look at Service Broker instead. You could use three queues with that: one for type M1 and one for type M2. Every time a message arrives in those queues, a procedure can automatically be called to insert a token into the third queue. The third queue could then activate a process that checks whether both messages exist and does the work. That would make the entire process asynchronous, but it would be easy to restrict the queue-3 responder to only ever do one check at a time.
See Service Broker on MSDN; also look at "activation" for the automatic message processing.
sp_GetAppLock is just like many other tools and as such it can be misused, overused, or correctly used. It is an exact match for the type of problem described by the original poster.
This is a good MSSQL Tips post on the usage
Prevent multiple users from running the same SQL Server stored procedure at the same time
http://www.mssqltips.com/sqlservertip/3202/prevent-multiple-users-from-running-the-same-sql-server-stored-procedure-at-the-same-time/
We use sp_getapplock all the time, due to the fact that we support some legacy applications that have been re-worked to use a SQL back-end, and the SQL Server locking model is not an exact match for our application logic.
We tend to go for a 'pessimistic' locking model, where we lock an entity before allowing a user to edit it, and use the (NOLOCK) hint extensively when reading data to bypass any blocking from the native locks on the actual tables. sp_getapplock is a good match for this. We also use it to enforce critical paths in large multi-user systems. You have to be systematic about how you name the locks you place.
We've found no performance problems with large numbers of users/locks via this route, so I see no reason why it wouldn't work well for you. Just be aware that you can get blocking and deadlocks if you have processes that place the same named locks, but not necessarily in the same order.
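For reference, a minimal sketch of the transaction-scoped pattern (resource name and timeout are hypothetical):

BEGIN TRAN;

DECLARE @rc int;
EXEC @rc = sp_getapplock
     @Resource = 'ProcessMessagePair',   -- pick a systematic lock name
     @LockMode = 'Exclusive',
     @LockOwner = 'Transaction',         -- released automatically at COMMIT/ROLLBACK
     @LockTimeout = 10000;               -- ms; -1 = wait indefinitely

IF @rc >= 0   -- 0 = granted, 1 = granted after waiting; negative values = failure
BEGIN
    -- critical section: do the serialized work here
END

COMMIT;  -- releases the app lock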
You can create a table with a flag for each set of messages, so whichever thread starts processing first marks the flag as 'processing'.
To make sure the record is blocked properly once one of the threads reaches it, use:
SELECT ... FROM <table> WITH (XLOCK, ROWLOCK, READCOMMITTED) WHERE ...
This piece of code puts an exclusive lock on the record, meaning whichever thread gets to it first owns the row.
Then you make your changes and update the flag; the other thread will get the updated value, because it will be blocked by the exclusive lock until the first thread commits or rolls back its transaction.
For this to work you always need to select records from the table with XLOCK; that way it behaves as expected.
Hope this helps.
Exclusive lock proof:
USE master
GO

IF OBJECT_ID('dbo.tblTest') IS NOT NULL
    DROP TABLE dbo.tblTest;

CREATE TABLE tblTest (id int PRIMARY KEY);

;WITH cteNumbers AS (
    SELECT 1 AS N
    UNION ALL
    SELECT N + 1 FROM cteNumbers WHERE N < 1000
)
INSERT INTO tblTest
SELECT N
FROM cteNumbers
OPTION (MAXRECURSION 0);

BEGIN TRANSACTION;

SELECT * FROM dbo.tblTest WITH (XLOCK, ROWLOCK, READCOMMITTED) WHERE id = 1;
SELECT * FROM sys.dm_tran_locks WHERE resource_database_id = DB_ID('master');

ROLLBACK TRANSACTION;

Lost Update Anomaly in Sql Server Update Command

I am very much confused.
I have a transaction at the READ COMMITTED isolation level. Among other things, I am updating a counter value in it, something similar to the below:
UPDATE tblCount SET counter = counter + 1
My application is a desktop application, and this transaction occurs quite frequently and concurrently. We recently noticed an error where the counter value sometimes doesn't get updated, or an update is missed. We also insert one record on each counter update, so we are sure the records have been inserted, but somehow the counter fails to update. This happens about once in 2,000 simultaneous transactions.
It looks like a lost update anomaly, but I seriously doubt that is what I'm facing: if you look at the command above, it just updates the counter from its own value. If I have started a transaction and the transaction has reached this statement, it should have locked the row. This should not cause a lost update, but it's happening somehow.
Could it be that this update command works in two parts? That is, does it first read the counter value (without taking an exclusive lock) and then write the new calculated value (at which point it does take an exclusive lock)?
Please help; I'm really confused.
The update command does not work in two parts. It only works in one.
There's something else going on, and my first guess would be that your transaction is rolling back for another reason. Out of those 2,000 transactions, for example, one may be rolling back - especially if you're doing a ton of things concurrently - and it didn't succeed at all.
That update may not have been what caused the problem, either - you may have deadlocks involved due to other transactions, and they may be failing before the update command (or during the update command).
I'd zoom out and ask questions about the transaction's error handling. Are you doing everything in try/catch blocks? Are you capturing error levels when transactions fail? If not, you'll need to capture a trace with Profiler to find out what's going on.
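If not, here is a sketch of the kind of wrapper that surfaces such failures (the logging destination is hypothetical; standalone THROW requires SQL Server 2012+):

BEGIN TRY
    BEGIN TRAN;
    UPDATE tblCount SET counter = counter + 1;
    -- the accompanying INSERT goes here
    COMMIT;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK;
    -- log ERROR_NUMBER(), ERROR_MESSAGE(), ERROR_LINE() somewhere durable first
    THROW;  -- re-raise so the application actually sees the failure
END CATCH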
Are you sure that the SQL is always succeeding? What I mean is, could it be something like an occasional lock timeout? Are you handling SQL exceptions in your .NET code in a way that makes you aware of them (e.g., a pop-up message or a log entry)?

At what point will a series of selected SQL statements stop if I cancel the execution request in SQL Server Management Studio?

I am running a bunch of database migration scripts. I find myself with a rather pressing problem: the business is waking up and expecting to see their data, and their data has not finished migrating. I also took the applications offline, and they really need to be started back up. In reality "the business" is a number of companies, and therefore I have a number of scripts running SPs in one query window, like so:
EXEC [dbo].[MigrateCompanyById] 34
GO
EXEC [dbo].[MigrateCompanyById] 75
GO
EXEC [dbo].[MigrateCompanyById] 12
GO
EXEC [dbo].[MigrateCompanyById] 66
GO
Each SP calls a large number of other sub-SPs to migrate all of the data required. I am considering cancelling the query, but I'm not sure at what point the execution will be cancelled. If it cancels nicely at the next GO then I'll be happy. If it cancels midway through one of the company migrations, then I'm screwed.
If I cannot cancel, could I ALTER the MigrateCompanyById SP and comment all the sub SP calls out? Would that also prevent the next one from running, whilst completing the one that is currently running?
Any thoughts?
One way to achieve a controlled cancellation is to add a table containing a cancel flag. You can set this flag when you want to cancel execution, and your SPs can check it at regular intervals and stop executing if appropriate.
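A sketch of that approach (table and column names hypothetical):

CREATE TABLE dbo.MigrationControl (CancelRequested bit NOT NULL DEFAULT (0));
INSERT INTO dbo.MigrationControl (CancelRequested) VALUES (0);

-- inside the migration procs, at safe checkpoints:
IF EXISTS (SELECT 1 FROM dbo.MigrationControl WHERE CancelRequested = 1)
    RETURN;  -- stop cleanly at a known-good boundary

-- from another connection, to request cancellation:
UPDATE dbo.MigrationControl SET CancelRequested = 1;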
I was forced to cancel the script anyway.
When doing so, I noted that it cancels after the currently executing statement, regardless of where that statement is in the SP execution chain.
Are you bracketing the code within each migration stored proc with transaction handling (BEGIN, COMMIT, etc.)? That would enable you to roll back the changes relatively easily depending on what you're doing within the procs.
One solution I've seen: you have a table with a single record holding a bit value of 0 or 1. If that record is 0, your production application disallows access by the user population, enabling you to do whatever you need to; you then set the flag to 1 after your task is complete to let production continue. This might not be practical given your environment, but it can give you assurance that no users will be messing with your data through the app until you decide it's ready to be messed with.
You can use this method to report the execution progress of your script.
The way you have it now, every sproc is its own transaction, so if you cancel the script the data will only be partially updated, up to the last successfully executed sproc.
You can, however, put it all in a single transaction if you need an all-or-nothing update.
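A sketch of that all-or-nothing variant: the GO separators have to go so that everything shares one batch and one transaction, and SET XACT_ABORT ON makes any runtime error roll the whole thing back:

SET XACT_ABORT ON;
BEGIN TRAN;
EXEC [dbo].[MigrateCompanyById] 34;
EXEC [dbo].[MigrateCompanyById] 75;
EXEC [dbo].[MigrateCompanyById] 12;
EXEC [dbo].[MigrateCompanyById] 66;
COMMIT;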
