I am using CONTEXT_INFO to pass a username to a delete trigger for the purposes of an audit/history table. I'm trying to understand the scope of CONTEXT_INFO and if I am creating a potential race condition.
Each of my database tables has a stored proc to handle deletes. The delete stored proc takes userId as a parameter and sets CONTEXT_INFO to the userId. My delete trigger then reads CONTEXT_INFO and uses it to update an audit table that records who deleted the row(s).
The question is: if two delete sprocs from different users are executing at the same time, can the CONTEXT_INFO set in one of the sprocs be consumed by the trigger fired by the other sproc?
I've seen this article http://msdn.microsoft.com/en-us/library/ms189252.aspx but I'm not clear on the scope of sessions and batches in SQL Server, which is key to the article being helpful!
I'd post code, but short on time at the moment. I'll edit later if this isn't clear enough.
Thanks in advance for any help.
Context info has no scope (in the sense of language variable scope) and is bound to the session lifetime. Once set, the context info stays at that value until the connection is closed (the session terminates) or until a new value is set. Since execution on a session is always sequential, there is no question of concurrency.
If you set the context info in a procedure, any trigger subsequently executed on that session will see the newly set value. Setting the user id in the context info, as you propose, and using it in triggers is the typical example of context info use and is perfectly safe with regard to concurrency, since there is basically no concurrency to speak of. If you plan to set the context info in a stored procedure and then rely on it in a trigger that fires because of deletes in that procedure, your batch has not finished yet, so, according to the article you linked, you retrieve the context info from the sys.dm_exec_requests DMV or from the CONTEXT_INFO() function. It will not yet have been pushed to sys.dm_exec_sessions; that can only happen after you exit the stored procedure and finish any other call in the T-SQL batch sent to the server (the 'request').
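A minimal sketch of the pattern described in the question, in case it helps. The table, trigger and procedure names (dbo.Widget, dbo.WidgetAudit, dbo.trg_Widget_Delete, dbo.Widget_Delete) are made up for illustration, and I'm assuming userId is an int:

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE dbo.Widget (WidgetId int NOT NULL PRIMARY KEY);
CREATE TABLE dbo.WidgetAudit (
    WidgetId  int      NOT NULL,
    DeletedBy int      NULL,
    DeletedAt datetime NOT NULL DEFAULT GETDATE()
);
GO

CREATE TRIGGER dbo.trg_Widget_Delete ON dbo.Widget
AFTER DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- The stored value can come back zero-padded to 128 bytes, so read
    -- only the first 4 bytes before casting back to int.
    DECLARE @UserId int = CAST(SUBSTRING(CONTEXT_INFO(), 1, 4) AS int);

    INSERT INTO dbo.WidgetAudit (WidgetId, DeletedBy)
    SELECT d.WidgetId, @UserId
    FROM deleted AS d;
END
GO

CREATE PROCEDURE dbo.Widget_Delete
    @WidgetId int,
    @UserId   int
AS
BEGIN
    SET NOCOUNT ON;
    -- SET CONTEXT_INFO takes a binary constant or variable, not an expression.
    DECLARE @ctx varbinary(128) = CAST(@UserId AS varbinary(128));
    SET CONTEXT_INFO @ctx;

    -- The trigger fires on this same session, so it sees the value set above.
    DELETE FROM dbo.Widget WHERE WidgetId = @WidgetId;
END
GO
```

If CONTEXT_INFO was never set on the session, the trigger above simply records NULL in DeletedBy.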
I've used this exact method for auditing at one client site and they've been using it heavily for close to 6 months now with no problems.
The context information is scoped to the current connection for the current batch and any batches that start after the current batch has completed. Two users in your environment would either not be on the same connection, or if there is connection sharing they would still have their own values if they overlapped at all. If one came after the other then the second one would overwrite the first, but it would have been done with it by then anyway. At least this is my understanding of how it works. You can look up MARS (Multiple Active Result Sets) for more information on it.
Related
The SET STATISTICS TIME statement is only useful while developing, since it lets you performance tune each additional statement as it is added to the query or UDF/SP being worked on. However, when you have to performance tune existing code, e.g. an SP with hundreds or thousands of lines of code, its output is pretty much useless because it is not clear which SQL statement the recorded times belong to.
Are there any alternatives to SET STATISTICS TIME that also show the statements the recorded times belong to?
I would recommend using a more advanced tool: SentryOne Plan Explorer (free). It shows a single call of an SP with all of its internal details. On the right you have the history of different runs, which can be commented on and analyzed later. Everything you need (stats, index usage, IO, waits) is available on different tabs.
If your Stored Procedures are granular then you could use this DMV to get an idea of times.
SELECT
DB_NAME(qs.database_id) AS DBName
,qs.database_id
,qs.object_id
,OBJECT_NAME(qs.object_id,qs.database_id) AS ObjectName
,qs.cached_time
,qs.last_execution_time
,qs.plan_handle
,qs.execution_count
,total_worker_time
,last_worker_time
,min_worker_time
,max_worker_time
,total_physical_reads
,last_physical_reads
,min_physical_reads
,max_physical_reads
,total_logical_writes
,last_logical_writes
,min_logical_writes
,max_logical_writes
,total_logical_reads
,last_logical_reads
,min_logical_reads
,max_logical_reads
,total_elapsed_time
,last_elapsed_time
,min_elapsed_time
,max_elapsed_time
FROM
sys.dm_exec_procedure_stats qs
I'd create an extended events session similar to the one below:
CREATE EVENT SESSION [proc_statments] ON SERVER
ADD EVENT sqlserver.module_end(
WHERE ([object_name]=N'usp_foobar')
),
ADD EVENT sqlserver.sp_statement_completed(
SET collect_object_name=(1),collect_statement=(1)
WHERE ([object_name]=N'usp_foobar'))
ADD TARGET package0.event_file(SET filename=N'proc_statments')
WITH (TRACK_CAUSALITY=ON)
GO
This tracks both stored procedure and stored procedure statement completion for a procedure called usp_foobar. Within the event itself, there's an identifier that helps you tie together which statements were executed as a result of having executed a specific procedure (that's what the TRACK_CAUSALITY is for).
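To actually get numbers out of that session, something like the following could work. This is only a sketch: I'm assuming the procedure lives in the dbo schema and takes no parameters, and that the event file was written to the default log directory (you may need to give the full path to the .xel files instead).

```sql
-- Start the session, run the procedure once, then stop the session.
ALTER EVENT SESSION [proc_statments] ON SERVER STATE = START;
GO
EXEC dbo.usp_foobar;  -- assumed schema and no parameters
GO
ALTER EVENT SESSION [proc_statments] ON SERVER STATE = STOP;
GO

-- Shred the captured events into one row per statement / module end.
SELECT
    x.event_xml.value('(event/@name)[1]', 'nvarchar(128)') AS event_name,
    x.event_xml.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS duration_us,
    x.event_xml.value('(event/data[@name="statement"]/value)[1]', 'nvarchar(max)') AS statement_text,
    -- Added by TRACK_CAUSALITY; ties statements to the call that ran them.
    x.event_xml.value('(event/action[@name="attach_activity_id"]/value)[1]', 'nvarchar(64)') AS activity_id
FROM (
    SELECT CAST(f.event_data AS xml) AS event_xml
    FROM sys.fn_xe_file_target_read_file(N'proc_statments*.xel', NULL, NULL, NULL) AS f
) AS x;
```

The module_end rows have no statement text, so statement_text is NULL for them.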
I do not have an issue in test but I am wondering about a live environment with thousands of simultaneous users.
If I execute two queries separately:
insert into table (somecolumnname) values (somedata)
followed by
select @@IDENTITY as lastInsertId
This will get me the last insert ID to be used in other queries later on. The documentation on @@IDENTITY (http://msdn.microsoft.com/en-gb/library/aa933167(v=sql.80).aspx) says:
The scope of the @@IDENTITY function is the local server on which it is executed.
If the scope is the local server and someone else runs another insert in the time it takes me to execute the second query, will I get the wrong insert ID, even within a transaction?
Secondly, I read a conflicting account here (http://msdn.microsoft.com/en-gb/library/ms187342.aspx):
The scope of the @@IDENTITY function is current session on the local server on which it is executed.
Therefore, if @@IDENTITY is session based, what exactly is a session when connecting using the native client and connection pooling? From my understanding of connection pooling, if a second user accesses my site while the first user is accessing it, the native client will use the same connection instead of opening a second separate connection. If this is the case, is this the same as sharing the session when it comes to @@IDENTITY, meaning that inserts from other users can still affect the results?
It does sound like a transaction should ensure I get the right value from @@IDENTITY, but I am finding it hard to find clear documentation telling me that this is the case, and of course this is hard to test in a test environment.
I've tried executing both of the queries together using an absolute cursor:
insert into table (somecolumnname) values (somedata);select @@IDENTITY as lastinsertid
But the problem is that this runs through an abstraction function which appends the select @@IDENTITY query to the end to set a dirty global to the last insert id (I can't change this easily or I would). As there is no sqlsrv_last_result function, there is no way for me to know whether I am on the last result without moving the cursor too far and checking if it is null. That would require me to pull data from every query in a try/catch (to prevent errors when pulling data from deletes, for example) and then just use the last pull before the cursor returned null, which is of course extremely slow.
If someone has managed to test this or can point to somewhere where the definitions are not conflicting that would be much appreciated.
Thanks,
I have figured out that the native client (unlike DBLIB) defaults to NOCOUNT OFF, meaning that INSERTs report the number of affected rows as results of their own, which was causing my original issue.
Immediately running "SET NOCOUNT ON" followed by using SCOPE_IDENTITY() within the same query batch seems to have solved the issue by not requiring me to run two separate queries in order to locate the select portion.
In the future more inserts will use stored procedures instead.
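For anyone landing here, the single batch ended up looking roughly like this. The table and column names are placeholders (dbo.SomeTable and SomeColumnName are not my real schema):

```sql
-- Placeholder table with an identity column, for illustration only.
CREATE TABLE dbo.SomeTable (
    Id             int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    SomeColumnName nvarchar(50)       NOT NULL
);
GO

-- One batch: suppress the rows-affected message so the only result
-- coming back is the SELECT, then read the identity value generated
-- in this scope on this session.
SET NOCOUNT ON;
INSERT INTO dbo.SomeTable (SomeColumnName) VALUES (N'some data');
SELECT SCOPE_IDENTITY() AS lastInsertId;
```

Unlike @@IDENTITY, SCOPE_IDENTITY() ignores identity values generated by triggers that fire as a result of the insert.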
I have a long-running SP (it can run for up to several minutes) that basically performs a number of cleanup operations on various tables within a transaction. I'm trying to determine the best way to somehow pass human-readable status information back to the caller on what step of the process the SP is currently performing.
Because the entire SP runs inside a single transaction, I can't write this information back to a status table and then read it from another thread unless I use NOLOCK to read it, which I consider a last resort since:
NOLOCK can cause other data inconsistency issues; and
this places the onus on anyone wanting to read the status table that they need to use NOLOCK because the table or row(s) could be locked for quite a while.
Is there any way to issue a single command (or EXEC a second SP) within a transaction and specify that that particular command shouldn't be part of the transaction? Or is there some other way for ADO.NET to gain insight into this long-running SP to see what it is currently doing?
You can PRINT messages in T-SQL and get them delivered to your SqlConnection in ADO.NET via the "InfoMessage" event. See
http://msdn.microsoft.com/en-us/library/a0hee08w.aspx
for details.
You could try using RAISERROR (use a severity of 10 or lower) within the procedure to return informational messages.
Example:
RAISERROR(N'Step 5 completed.', 10, 1) WITH NOWAIT;
I'm building my own clone of http://statoverflow.com/sandbox (using the free controls provided to 10K users from Telerik). I have a proof of concept available I can use locally, but before I open it up to others I need to lock it down some more. Currently I run everything through a stored procedure that looks something like this:
CREATE PROCEDURE WebQuery
    @QueryText nvarchar(1000)
AS
BEGIN
    -- no writes, so no need to lock on select
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
    -- throttles
    SET ROWCOUNT 500
    SET QUERY_GOVERNOR_COST_LIMIT 500
    exec (@QueryText)
END
I need to do two things yet:
Replace QUERY_GOVERNOR_COST_LIMIT with an actual rather than estimated timeout, so no query runs longer than say 2 minutes.
Right now nothing stops users from just putting their own 'SET ROWCOUNT 50000;' in front of their query text to override my restriction, so I need to somehow limit the queries to a single statement or (preferably) disallow SET commands inside the exec call.
Any ideas?
You really plan to allow users to run arbitrary ad-hoc SQL? Only then can a user place a SET in the text to override your restrictions. If that's the case, your best bet is to do some basic parsing using lex/yacc or flex/bison (or your favorite CLR language tree parser) and detect invalid SET statements. Are you going to allow SET @variable=value though, which syntactically is a SET...
If you impersonate low privileged users via EXECUTE AS make sure you create an irreversible impersonation context, so the user does not simply execute REVERT and regain all the privileges :) You also must really understand the implications of database impersonation, make sure you read Extending Database Impersonation by Using EXECUTE AS.
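One way to read "irreversible" here is the WITH COOKIE form of EXECUTE AS. A rough sketch, assuming a hypothetical low-privilege database user named WebQueryReader and a hypothetical wrapper procedure name; the caller also needs permission to impersonate that user:

```sql
-- Sketch only: WebQueryReader and WebQuery_Impersonated are made-up names.
CREATE PROCEDURE dbo.WebQuery_Impersonated
    @QueryText nvarchar(1000)
AS
BEGIN
    DECLARE @cookie varbinary(8000);

    -- Switch to the low-privilege user; the cookie is required to switch back.
    EXECUTE AS USER = N'WebQueryReader' WITH COOKIE INTO @cookie;

    -- @cookie is not visible inside the dynamic batch, so a plain REVERT
    -- submitted in @QueryText should not be able to restore the
    -- caller's privileges.
    EXEC (@QueryText);

    -- Only code that holds the cookie can undo the context switch.
    REVERT WITH COOKIE = @cookie;
END
```

EXECUTE AS ... WITH NO REVERT is the other documented option; which one fits depends on whether the session needs its original context back afterwards.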
Another thing to consider is deferring execution of requests to a queue. Since queue readers can be calibrated via MAX_QUEUE_READERS, you get very cheap throttling. See Asynchronous procedure execution for a related article on how to use queues to execute batches. This mechanism is different from resource governance, but I've seen it used to more effect than the governor itself.
Throwing this out there:
The EXEC statement appears to support impersonation. See http://msdn.microsoft.com/en-us/library/ms188332.aspx. Perhaps you can impersonate a limited user. I am looking into the availability of limitations that may prevent SET statements and the like.
On a very basic level, how about blocking any statement that doesn't start with SELECT? Or will other query starts be supported, like CTEs or DECLARE statements? 1000 chars isn't too much room to play with, but I'm not too clear what this is in the first place.
UPDATED
Ok, how about prefixing whatever they submit with SELECT TOP 500 * FROM ( and appending a ) plus a table alias? If they try to submit multiple statements it'll throw an error you can catch. And to prevent denial of service, replace their starting SELECT with another SELECT TOP 500.
Doesn't help if they've appended an ORDER BY to something returning a million rows, though.
I am running a bunch of database migration scripts. I find myself with a rather pressing problem: the business is waking up and expecting to see their data, and their data has not finished migrating. I also took the applications offline and they really need to be started back up. In reality "the business" is a number of companies, and therefore I have a number of scripts running SPs in one query window like so:
EXEC [dbo].[MigrateCompanyById] 34
GO
EXEC [dbo].[MigrateCompanyById] 75
GO
EXEC [dbo].[MigrateCompanyById] 12
GO
EXEC [dbo].[MigrateCompanyById] 66
GO
Each SP calls a large number of other sub SPs to migrate all of the data required. I am considering cancelling the query, but I'm not sure at what point the execution will be cancelled. If it cancels nicely at the next GO then I'll be happy. If it cancels mid way through one of the company migrations, then I'm screwed.
If I cannot cancel, could I ALTER the MigrateCompanyById SP and comment all the sub SP calls out? Would that also prevent the next one from running, whilst completing the one that is currently running?
Any thoughts?
One way to achieve a controlled cancellation is to add a table containing a cancel flag. You can set this flag when you want to cancel execution, and your SPs can check it at regular intervals and stop executing if appropriate.
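A rough sketch of what I mean. The control table and the procedure below are made-up names, and the two "step" comments stand in for your real sub SP calls:

```sql
-- Hypothetical single-row control table.
CREATE TABLE dbo.MigrationControl (CancelRequested bit NOT NULL DEFAULT 0);
INSERT INTO dbo.MigrationControl (CancelRequested) VALUES (0);
GO

CREATE PROCEDURE dbo.MigrateCompanyById_Sketch
    @CompanyId int
AS
BEGIN
    SET NOCOUNT ON;

    -- Step 1 would go here (call to the first sub SP).

    -- Check the flag at a point where it is safe to stop.
    IF EXISTS (SELECT 1 FROM dbo.MigrationControl WHERE CancelRequested = 1)
    BEGIN
        RAISERROR(N'Cancellation requested; stopping before the next step.', 10, 1) WITH NOWAIT;
        RETURN;
    END

    -- Step 2 would go here (call to the next sub SP).
END
GO

-- From another query window, to request a stop:
-- UPDATE dbo.MigrationControl SET CancelRequested = 1;
```

Where you put the checks decides how "clean" the stop is, e.g. only between companies, or between sub SPs within one company.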
I was forced to cancel the script anyway.
When doing so, I noted that it cancels after the current executing statement, regardless of where it is in the SP execution chain.
Are you bracketing the code within each migration stored proc with transaction handling (BEGIN, COMMIT, etc.)? That would enable you to roll back the changes relatively easily depending on what you're doing within the procs.
One solution I've seen is a table with a single record holding a bit value of 0 or 1. If that record is 0, your production application disallows access by the user population, which lets you do whatever you need to; you then set the flag to 1 after your task is complete so production can continue. This might not be practical given your environment, but it can give you assurance that no users will be messing with your data through your app until you decide that it's ready to be messed with.
You can use this method to report the execution progress of your script.
The way you have it now, every sproc is its own transaction, so if you cancel the script you will end up with only a partial update, up to the point of the last successfully executed sproc.
You can, however, put it all in a single transaction if you need an all-or-nothing update.