At what point will a series of selected SQL statements stop if I cancel the execution request in SQL Server Management Studio? - sql-server

I am running a bunch of database migration scripts. I find myself with a rather pressing problem: the business is waking up and expects to see its data, and the data has not finished migrating. I also took the applications offline, and they really need to be started back up. In reality "the business" is a number of companies, so I have a number of scripts running SPs in one query window like so:
EXEC [dbo].[MigrateCompanyById] 34
GO
EXEC [dbo].[MigrateCompanyById] 75
GO
EXEC [dbo].[MigrateCompanyById] 12
GO
EXEC [dbo].[MigrateCompanyById] 66
GO
Each SP calls a large number of other sub SPs to migrate all of the data required. I am considering cancelling the query, but I'm not sure at what point the execution will be cancelled. If it cancels nicely at the next GO then I'll be happy. If it cancels mid way through one of the company migrations, then I'm screwed.
If I cannot cancel, could I ALTER the MigrateCompanyById SP and comment all the sub SP calls out? Would that also prevent the next one from running, whilst completing the one that is currently running?
Any thoughts?

One way to achieve a controlled cancellation is to add a table containing a cancel flag. You can set this flag when you want to cancel execution, and your SPs can check it at regular intervals and stop executing if appropriate.
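A minimal sketch of that idea (the table, column, and message are placeholder assumptions):
-- One-row control table holding the cancel flag.
CREATE TABLE dbo.MigrationControl (CancelRequested bit NOT NULL DEFAULT 0);
INSERT INTO dbo.MigrationControl (CancelRequested) VALUES (0);
GO
-- Inside each long-running SP, check the flag at convenient points:
IF EXISTS (SELECT 1 FROM dbo.MigrationControl WHERE CancelRequested = 1)
BEGIN
    RAISERROR('Migration cancelled by operator.', 16, 1);
    RETURN;
END
GO
-- To request a controlled cancellation from another connection:
UPDATE dbo.MigrationControl SET CancelRequested = 1;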

I was forced to cancel the script anyway.
When doing so, I noted that it cancels after the current executing statement, regardless of where it is in the SP execution chain.

Are you bracketing the code within each migration stored proc with transaction handling (BEGIN, COMMIT, etc.)? That would enable you to roll back the changes relatively easily depending on what you're doing within the procs.
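A sketch of that bracketing around MigrateCompanyById (the parameter name and the TRY/CATCH layout are assumptions about how the proc is structured):
ALTER PROCEDURE [dbo].[MigrateCompanyById] @CompanyId int
AS
SET XACT_ABORT ON;  -- any run-time error aborts and rolls back the transaction
BEGIN TRY
    BEGIN TRANSACTION;
    -- EXEC the sub SPs that migrate this company's data here ...
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    THROW;  -- SQL Server 2012+; use RAISERROR on older versions
END CATCH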
One solution I've seen: you have a table with a single record holding a bit value of 0 or 1. If that record is 0, your production application disallows access by the user population, enabling you to do whatever you need to; you then set the flag to 1 after your task is complete so production can continue. This might not be practical given your environment, but it gives you assurance that no users will be messing with your data through your app until you decide it's ready to be messed with.

You can use this method to report execution progress of your script.
The way you have it now, every sproc is its own transaction, so if you cancel the script the data will be updated only partially, up to the point of the last successfully executed sproc.
You can, however, put it all in a single transaction if you need an all-or-nothing update.
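For example, an all-or-nothing version of the batch from the question might look like this (a sketch; the GO separators are dropped, and with XACT_ABORT ON a cancel or an error rolls the whole lot back):
SET XACT_ABORT ON;
BEGIN TRANSACTION;
EXEC [dbo].[MigrateCompanyById] 34;
EXEC [dbo].[MigrateCompanyById] 75;
EXEC [dbo].[MigrateCompanyById] 12;
EXEC [dbo].[MigrateCompanyById] 66;
COMMIT TRANSACTION;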

Related

SQL Error during Lazy Loop awaiting Azure DB resize

I want to automate some DB scaling in my Azure SQL database.
This can be easily initiated using this:
ALTER DATABASE [myDatabase]
MODIFY (EDITION ='Standard', SERVICE_OBJECTIVE = 'S3', MAXSIZE = 250 GB);
But that command returns instantly, whilst the resize takes a few 10s of seconds to complete.
We can check the actual current size using the following, which doesn't update until the change is complete:
SELECT DATABASEPROPERTYEX('myDatabase', 'ServiceObjective')
So naturally I wanted to combine this with a WHILE loop and a WAITFOR DELAY, in order to create a stored procedure that will change the DB size, and not return until the change has completed.
But when I wrote that stored procedure (script below) and ran it, I get the following error every time (at about the same time that the size change completes):
A severe error occurred on the current command. The results, if any, should be discarded.
The resize succeeds, but I get errors instead of a cleanly finishing stored procedure call.
Various things I've already tested:
If I separate the "Initiate" and the "WaitLoop" sections, and start the WaitLoop in a separate connection, after initiation but before completion, then that also gives the same error.
Adding a TRY...CATCH block doesn't help either.
Removing the stored procedure aspect and just running the code directly doesn't fix it either.
My interpretation is that the Resize isn't quite as transparent as one might hope, and that connections created before the resize completes get corrupted in some sense.
Whatever the exact cause, it seems to me that this stored procedure just isn't achievable at all; I'll have to do the polling from my external process - opening new connections each time. It's not an awful solution, but it is less pleasant than being able to encapsulate the whole thing in a single stored procedure. Ah well, such is life.
Question:
Before I give up on this entirely ... does anyone have an alternative explanation or solution for this error, which would thus allow a single stored procedure call to change the size and then not return until that sizeChange actually completed?
Initial stored procedure code (simplified to remove parameterisation complexity):
CREATE PROCEDURE [trusted].[sp_ResizeAzureDbToS3AndWaitForCompletion]
AS
ALTER DATABASE [myDatabase]
MODIFY (EDITION ='Standard', SERVICE_OBJECTIVE = 'S3', MAXSIZE = 250 GB);
WHILE ((SELECT DATABASEPROPERTYEX('myDatabase', 'ServiceObjective')) != 'S3')
BEGIN
WAITFOR DELAY '00:00:05'
END
RETURN 0
Whatever the exact cause, it seems to me that this stored procedure
just isn't achievable at all; I'll have to do the polling from my
external process - opening new connections each time.
Yes this is correct. As described here when you change the service objective of a database
A new compute instance is created with the requested service tier and
compute size... the database remains online during this step, and
connections continue to be directed to the database in the original
compute instance ... [then] existing connections to the database in
the original compute instance are dropped. Any new connections are
established to the database in the new compute instance.
It is the dropping of existing connections that will kill your stored procedure execution. You need to do this check externally.
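A minimal sketch of that external check; the point is that each poll runs on a freshly opened connection, so nothing is left connected while the swap happens:
-- Run on a new connection for each poll; keep reconnecting and
-- re-running this until it returns 'S3'.
SELECT DATABASEPROPERTYEX('myDatabase', 'ServiceObjective') AS CurrentServiceObjective;

-- Optionally (connected to the master database of the Azure SQL server),
-- this should show the state and progress of the pending change:
SELECT operation, state_desc, percent_complete
FROM sys.dm_operation_status
ORDER BY start_time DESC;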

Is there an alternative to SET STATISTICS TIME which also shows the statements?

The SET STATISTICS TIME statement is only useful while developing, as it lets you performance-tune each additional statement being added to the query or to the UDF/SP being worked on. However, when you have to performance-tune existing code, e.g. an SP with hundreds or thousands of lines, the output of this statement is pretty much useless, because it is not clear which SQL statement the recorded times belong to.
Is there any alternative to SET STATISTICS TIME that also shows the statements to which the recorded times belong?
I would recommend using a more advanced tool. Here is an example of one call of an SP with each and every internal detail. On the right you have a history of different runs, which can be commented and analyzed later. Everything you need for stats/index usage/IO/waits is available on different tabs. Tool: SentryOne Plan Explorer (free).
If your Stored Procedures are granular then you could use this DMV to get an idea of times.
SELECT
DB_NAME(qs.database_id) AS DBName
,qs.database_id
,qs.object_id
,OBJECT_NAME(qs.object_id,qs.database_id) AS ObjectName
,qs.cached_time
,qs.last_execution_time
,qs.plan_handle
,qs.execution_count
,total_worker_time
,last_worker_time
,min_worker_time
,max_worker_time
,total_physical_reads
,last_physical_reads
,min_physical_reads
,max_physical_reads
,total_logical_writes
,last_logical_writes
,min_logical_writes
,max_logical_writes
,total_logical_reads
,last_logical_reads
,min_logical_reads
,max_logical_reads
,total_elapsed_time
,last_elapsed_time
,min_elapsed_time
,max_elapsed_time
FROM
sys.dm_exec_procedure_stats qs
I'd create an extended events session similar to the one below:
CREATE EVENT SESSION [proc_statments] ON SERVER
ADD EVENT sqlserver.module_end(
WHERE ([object_name]=N'usp_foobar')
),
ADD EVENT sqlserver.sp_statement_completed(
SET collect_object_name=(1),collect_statement=(1)
WHERE ([object_name]=N'usp_foobar'))
ADD TARGET package0.event_file(SET filename=N'proc_statments')
WITH (TRACK_CAUSALITY=ON)
GO
This tracks both stored procedure and stored procedure statement completion for a procedure called usp_foobar. Within the event itself, there's an identifier that helps you tie together which statements were executed as a result of having executed a specific procedure (that's what the TRACK_CAUSALITY is for).
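To read the captured events back from the file target afterwards, something along these lines should work (the file name pattern is an assumption based on the session definition above; adjust the path to wherever the .xel files are written):
SELECT
    x.event_data.value('(event/@name)[1]', 'varchar(60)') AS event_name,
    x.event_data.value('(event/data[@name="statement"]/value)[1]', 'nvarchar(max)') AS sql_statement,
    x.event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS duration
FROM (
    SELECT CAST(event_data AS xml) AS event_data
    FROM sys.fn_xe_file_target_read_file(N'proc_statments*.xel', NULL, NULL, NULL)
) AS x;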

Launch stored procedure and continue running it even if disconnected

I have a database where data is processed in some kind of batches, where each batch may contain even a million records. I am processing data in a console application, and when I'm done with a batch, I mark it as Done (to avoid reading it again in case it does not get deleted), delete it and move on to a next batch.
I have the following simple stored procedure which deletes processed "batches" of data
CREATE PROCEDURE [dbo].[DeleteBatch]
(
@BatchId bigint
)
AS
SET XACT_ABORT ON
BEGIN TRANSACTION
DELETE FROM table1 WHERE BatchId = @BatchId
DELETE FROM table2 WHERE BatchId = @BatchId
DELETE FROM table3 WHERE BatchId = @BatchId
COMMIT
RETURN @@Error
I am using NHibernate with command timeout value 10 minutes, and the DeleteBatch procedure call times out occasionally.
Actually I don't want to wait for DeleteBatch to complete. I already have marked the batch as Done, so I want to go processing a next batch or maybe even exit my console application, if there are no more pending batches.
I am using Microsoft SQL Express 2012.
Is there any simple solution to tell the SQL server - "launch DeleteBatch and run it asynchronously even if I disconnect, and I don't even need the result of the procedure"?
It would also be great if I could set a lower processing priority for DeleteBatch because other queries are more important than DeleteBatch.
I don't know much about NHibernate, but if you can use ADO.NET in this scenario then you can implement asynchronous database operations easily using the SqlCommand.BeginExecuteNonQuery method in C#. This method starts the process of asynchronously executing a Transact-SQL statement or stored procedure that does not return rows, so that other tasks can run concurrently while the statement is executing.
EDIT: If you really want to exit your console app before the DB operation ends, then you will have to manually create threads in your code and perform the DB operation in those threads. When you close your console app, those threads would still be alive, because threads created using System.Threading.Thread are foreground threads by default. Having said that, it is also important to consider how many threads you will create. In your case you would have to assign one thread to each batch; if the number of batches is very large, a large number of threads would be created, which would in turn eat a large amount of your CPU resources and could even freeze your OS for a long time.
Another simple solution I could suggest is to insert the BatchIds into a database table and create an INSERT trigger on that table. The trigger would then call a stored proc with BatchId as its parameter and perform the required tasks.
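A sketch of that trigger idea (table and trigger names are assumptions; note that the trigger body runs as part of the INSERT statement itself, so the INSERT does not return until DeleteBatch finishes):
CREATE TABLE dbo.BatchesToDelete (BatchId bigint NOT NULL PRIMARY KEY);
GO
CREATE TRIGGER dbo.trg_BatchesToDelete_Insert
ON dbo.BatchesToDelete
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @BatchId bigint;
    SELECT @BatchId = BatchId FROM inserted;  -- assumes one BatchId per INSERT
    EXEC dbo.DeleteBatch @BatchId;
END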
Hope it helps.
What if your console application, instead of trying to delete the batch, just wrote the batch id into a "BatchIdsToDelete" table? Then you could use an agent job running every x minutes/seconds (or whatever) to delete the top x percent of records for a given batch id, perhaps sleeping a little before tackling the next x percent.
Maybe worth having a look at that?
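A sketch of what that scheduled job step might run (the table names follow DeleteBatch from the question; the chunk size is an arbitrary assumption):
DECLARE @BatchId bigint;

WHILE EXISTS (SELECT 1 FROM dbo.BatchIdsToDelete)
BEGIN
    SELECT TOP (1) @BatchId = BatchId FROM dbo.BatchIdsToDelete;

    -- Delete this batch in small chunks so locks and log growth stay short.
    WHILE 1 = 1
    BEGIN
        DELETE TOP (5000) FROM table1 WHERE BatchId = @BatchId;
        IF @@ROWCOUNT = 0 BREAK;
        WAITFOR DELAY '00:00:01';  -- breathe between chunks
    END

    -- Repeat the chunked delete for table2 and table3, then dequeue the id.
    DELETE FROM dbo.BatchIdsToDelete WHERE BatchId = @BatchId;
END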
Look at this article, which explains how to do reliable asynchronous procedure execution, code included. It is based on Service Broker.
The problem with trying to use .NET async features (like BeginExecute, tasks, etc.) is that the call is unreliable: if the process exits before the procedure completes, the execution is cancelled on the server because the session is disconnected.
But you also need to look at the task itself: why is the deletion taking 10+ minutes? Is it blocked by contention? Are you missing indexes on BatchId? Use the Performance Troubleshooting Flowchart.
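On the indexing point, a minimal sketch (index names are assumptions; without something like these, each DELETE in DeleteBatch has to scan the whole table to find a batch's rows):
CREATE INDEX IX_table1_BatchId ON table1 (BatchId);
CREATE INDEX IX_table2_BatchId ON table2 (BatchId);
CREATE INDEX IX_table3_BatchId ON table3 (BatchId);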
Late to the party, but if someone else has this problem: use SQLCMD. With Express you are limited in the number of users (I think 2, but it may have changed since the last time I did much with Express). You can have sqlcmd run queries, stored procedures, ...
And you can kick off sqlcmd with the Windows Scheduler, a script, an Outlook rule ...
I used it to manage three or four thousand SQL Server Express instances, with their nightly maintenance scheduled through the Windows Scheduler.
You could also create and run a PowerShell script; it's more versatile and probably more widely used than sqlcmd.
I needed the same thing. After searching for a long time I found a solution; it's the easiest way:
// Build a connection string with asynchronous processing enabled
// (required for BeginExecuteNonQuery on .NET 4.0 and earlier).
SqlConnectionStringBuilder builder = new SqlConnectionStringBuilder("your connection string");
builder.AsynchronousProcessing = true;

SqlConnection newSqlConn = new SqlConnection(builder.ConnectionString);
newSqlConn.Open();

// Fire and forget: start the stored procedure without waiting for it to finish.
SqlCommand cmd = new SqlCommand(storeProcedureName, newSqlConn);
cmd.CommandType = CommandType.StoredProcedure;
cmd.BeginExecuteNonQuery(null, null);
Ideally, the SqlConnection object would take an optional parameter or property, the URL of a web service (WCF, Web API, or something yet to be named), and, if the user wished, would report execution progress and/or completion status by calling that URL with a well-known message.
Theoretically DbConnection is an extensible object that one is free to implement. However, it would take some review of what really can be and needs to be done before this approach can be called feasible.

What is the best approach of breaking long running stored procedures?

What is the best approach of breaking long running stored procedures (up to 20 minutes)?
The inside of the stored procedure is wrapped in a transaction. If I close the connection, will this transaction be rolled back?
Another approach is to start a transaction in C# before I start the stored procedure; when I want to cancel the stored procedure, I just need to roll back the C# transaction.
If you close the connection, SQL Server will rollback the transaction if it notices the disconnect before the transaction commits. There'll be a (very) small time window where the transaction might complete just when you disconnect.
A custom transaction adds complexity and has few benefits for a single stored procedure call. So I'd go for the disconnect.
You can set a timeout in three places: the connection, the command, and, if you are ultimately programming a web page, the page timeout.
The relevant timeout in this case is the command timeout.
Update: Cancelling by a user's event:
To cancel your command in response to a user event, call Cancel() on the command. I haven't written code to test this, but I suspect that once you call ExecuteReader() it will block, so you'd need an async call, BeginExecuteNonQuery(), which really is a pain to set up: it requires extra settings on the connection string and I think it requires SQL 2005+.
UPDATE: Re: Transactions
C# (or ADO.NET) transaction code adds about two lines of code and guarantees that invocations of stored procedures (which may have more than one statement in them, if not today then maybe a year from now) succeed or fail as a unit. Transactions are normally not a source of poor performance, and can be a source of poor performance when not used; e.g. a long series of inserts runs faster inside a transaction.
If you do not call CommitTrans(), the transaction will roll back; you do not have to explicitly call Rollback().
Set the timeout at the time you begin the transaction.

Enforce query restrictions

I'm building my own clone of http://statoverflow.com/sandbox (using the free controls provided to 10K users from Telerik). I have a proof of concept available I can use locally, but before I open it up to others I need to lock it down some more. Currently I run everything through a stored procedure that looks something like this:
CREATE PROCEDURE WebQuery
@QueryText nvarchar(1000)
AS
BEGIN
-- no writes, so no need to lock on select
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
-- throttles
SET ROWCOUNT 500
SET QUERY_GOVERNOR_COST_LIMIT 500
exec (@QueryText)
END
I need to do two things yet:
Replace QUERY_GOVERNOR_COST_LIMIT with an actual rather than estimated timeout, so no query runs longer than, say, 2 minutes.
Right now nothing stops users from just putting their own 'SET ROWCOUNT 50000;' in front of their query text to override my restriction, so I need to somehow limit the queries to a single statement or (preferably) disallow SET commands inside the exec call.
Any ideas?
You really plan to allow users to run arbitrary ad-hoc SQL? Only then can a user place a SET to override your restrictions. If that's the case, your best bet is to do some basic parsing using lex/yacc or flex/bison (or your favorite CLR language tree parser) and detect invalid SET statements. Are you going to allow SET @variable = value though, which syntactically is a SET...
If you impersonate low-privileged users via EXECUTE AS, make sure you create an irreversible impersonation context, so the user cannot simply execute REVERT and regain all the privileges :) You also must really understand the implications of database impersonation; make sure you read Extending Database Impersonation by Using EXECUTE AS.
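A sketch of that irreversible impersonation inside WebQuery, assuming a low-privileged database user named SandboxUser already exists; with the cookie pattern, user-supplied SQL cannot simply REVERT:
DECLARE @cookie varbinary(8000);
EXECUTE AS USER = 'SandboxUser' WITH COOKIE INTO @cookie;  -- REVERT now requires the cookie
EXEC (@QueryText);
REVERT WITH COOKIE = @cookie;  -- only this proc, which holds the cookie, can switch back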
Another thing to consider is deferring execution of requests to a queue. Since queue readers can be calibrated via MAX_QUEUE_READERS, you get very cheap throttling. See Asynchronous procedure execution for a related article on how to use queues to execute batches. This mechanism is different from resource governance, but I've seen it used to greater effect than the governor itself.
Throwing this out there:
The EXEC statement appears to support impersonation. See http://msdn.microsoft.com/en-us/library/ms188332.aspx. Perhaps you can impersonate a limited user. I am looking into whether any available limitations might prevent SET statements and the like.
On a very basic level, how about blocking any statement that doesn't start with SELECT? Or will other query openers be supported, like CTEs or DECLARE statements? 1000 chars isn't much room to play with, but I'm not too clear what this is for in the first place.
UPDATED
OK, how about prefixing whatever they submit with SELECT TOP 500 * FROM (
and appending ) AS q. If they try to run multiple statements it'll throw an error you can catch. And to prevent denial of service, replace their leading SELECT with another SELECT TOP 500.
Doesn't help if they've appended an ORDER BY to something returning a million rows, though.
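A sketch of that wrapping, assuming it happens inside WebQuery where @QueryText holds the user's submission:
DECLARE @Wrapped nvarchar(max) =
    N'SELECT TOP (500) * FROM (' + @QueryText + N') AS q';
EXEC sys.sp_executesql @Wrapped;
If the submitted text contains more than one statement, the wrapper produces a syntax error that can be caught, which lines up with the behaviour described above.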
