I have been looking for definitive documentation regarding the isolation level (or concurrency, or scope ... I'm not sure EXACTLY what to call it) of triggers in SQL Server.
I have found the following sources, which indicate that what I believe is true (namely, that two users executing updates to the same table, even the same rows, will each get their own independent, isolated trigger execution):
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/601977fb-306c-4888-a72b-3fbab6af0cdc/effects-of-concurrent-trigger-firing-on-inserted-and-deleted-tables?forum=transactsql
https://social.msdn.microsoft.com/forums/sqlserver/en-US/b78c3e7b-6b98-48e1-ad43-3c773c79a6ff/trigger-and-inserted-table
The first question is essentially the same question I am trying to find the answer to, but the answer given doesn't provide any sources. The second question also hits near the mark, and the answer is the same, but again, no sources are provided.
Can someone point me to where the available documentation makes the same assertions?
Thanks!
Well, Isolation Level and Scope are two very different things.
Isolation Level
Triggers operate within a transaction. By default, that transaction should be using the default isolation level of READ COMMITTED. However, if the calling process has specified a different isolation level, then that would override the default. As per usual: if desired, you should be able to override that within the trigger itself.
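As a minimal sketch (the dbo.Orders table and the trigger name are hypothetical), a trigger can override the isolation level it inherits from the caller, and the override only lasts until the trigger returns:

CREATE TABLE dbo.Orders (OrderId int PRIMARY KEY, Status varchar(20));
GO
CREATE TRIGGER dbo.trg_Orders_Update
ON dbo.Orders
AFTER UPDATE
AS
BEGIN
    -- The trigger runs inside the caller's transaction and inherits its
    -- isolation level; this SET affects only the statements inside the
    -- trigger and reverts when the trigger returns.
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    -- ... validation / auditing statements go here ...
END;
GO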
According to the MSDN page for DML Triggers:
The trigger and the statement that fires it are treated as a single transaction, which can be rolled back from within the trigger. If a severe error is detected (for example, insufficient disk space), the entire transaction automatically rolls back.
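A short sketch of that single-transaction behavior (reusing the hypothetical dbo.Orders table from the sketch above): a ROLLBACK issued inside the trigger undoes the firing statement as well.

CREATE TRIGGER dbo.trg_Orders_BlockDelete
ON dbo.Orders
AFTER DELETE
AS
BEGIN
    IF EXISTS (SELECT 1 FROM deleted WHERE Status = 'Shipped')
    BEGIN
        -- Rolls back the DELETE and everything else done so far in the transaction.
        ROLLBACK TRANSACTION;
        RAISERROR('Shipped orders cannot be deleted.', 16, 1);
    END
END;
GO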
Scope
The context provided is:
{from you}
two users, executing updates to the same table --even the same rows
{from the first linked MSDN article in the Question that is "essentially the same question I am trying to find the answer to"}
Are the inserted and deleted tables scoped to the current session? In other words, will they only contain the inserted and deleted records for the current scope, or will they contain the records for all current update operations against the same table? Can there even be truly concurrent operations, or will locks prevent this?
Before getting into the inserted and deleted tables it should be made very clear that there will only ever be a single DML operation happening on a particular row at any given moment. Two or more requests might come in at the exact same nanosecond, but all requests will take their turn, one at a time (and yes, due to locking).
Now, regarding what is in the inserted and deleted tables: Yes, only the rows for that particular event will be (and even can be) in those two pseudo-tables. If you execute an UPDATE that will modify 5 rows, only those 5 rows will be in the inserted and deleted tables. And since you are looking for documentation, the MSDN page for Use the inserted and deleted Tables states:
The deleted table stores copies of the affected rows during DELETE and UPDATE statements. During the execution of a DELETE or UPDATE statement, rows are deleted from the trigger table and transferred to the deleted table. The deleted table and the trigger table ordinarily have no rows in common.
The inserted table stores copies of the affected rows during INSERT and UPDATE statements. During an insert or update transaction, new rows are added to both the inserted table and the trigger table. The rows in the inserted table are copies of the new rows in the trigger table.
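To illustrate the scoping with a minimal sketch (again using the hypothetical dbo.Orders table from above): the inserted pseudo-table only ever holds the rows touched by the statement that fired this particular trigger execution.

CREATE TRIGGER dbo.trg_Orders_CountUpdated
ON dbo.Orders
AFTER UPDATE
AS
BEGIN
    -- If the firing UPDATE modified 5 rows, @cnt is 5, no matter what other
    -- sessions are doing to dbo.Orders at the same moment.
    DECLARE @cnt int = (SELECT COUNT(*) FROM inserted);
    PRINT CONCAT('Rows updated by this statement: ', @cnt);
END;
GO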
Tying this back to the other part of the question, the part relating to the Transaction Isolation Level: The Transaction Isolation Level has absolutely no effect on the inserted and deleted tables as they pertain specifically to that event/query. However, the net effect of that operation, which is captured in those two pseudo-tables, can still be visible to other processes if they are using the READ UNCOMMITTED Isolation Level or the NOLOCK table hint.
And just to clarify something, the MSDN page linked above regarding the inserted and deleted tables states at the very beginning that they are "in memory" but that is not exactly correct. Starting in SQL Server 2005, those two pseudo-tables are actually based in tempdb. The MSDN page for the tempdb Database states:
The tempdb system database is a global resource that is available to all users connected to the instance of SQL Server and is used to hold the following:
...
Row versions that are generated by data modification transactions for features, such as: online index operations, Multiple Active Result Sets (MARS), and AFTER triggers.
Prior to SQL Server 2005, the inserted and deleted tables were read from the Transaction Log (I believe).
To summarize, the inserted and deleted tables:
operate within a Transaction
are static (i.e. read-only) tables
are visible to only the current Trigger
only contain rows for the specific event/operation/query that fired that instance of that Trigger
Related
If an ETL process attempts to detect data changes on system-versioned tables in SQL Server by including rows whose rowversion column falls within a rowversion "delta window", e.g.:
where row_version >= @previous_etl_cycle_rowversion
and row_version < @current_etl_cycle_rowversion
... and the values for @previous_etl_cycle_rowversion and @current_etl_cycle_rowversion are selected from a logging table, to which the newest rowversion marker is appended at the start of each ETL cycle via:
insert into etl_cycle_logged_rowversion_marker (cycle_start_row_version)
select @@DBTS
... is it possible that a rowversion of a record falling within a given "delta window" (bounded by the 2 @@DBTS values) could be missed/skipped due to rowversion's behavior vis-à-vis transactional consistency? - i.e., is it possible that rowversion would be reflected on a basis of "eventual" consistency?
I'm thinking of a case where say, 1000 records are updated within a single transaction and somehow @@DBTS is "ahead" of the record's committed rowversion yet that specific version of the record is not yet readable...
(For the sake of scoping the question, please exclude any cases of deleted records or immediately consecutive updates on a given record within such a large batch transaction.)
If you make sure to avoid row versioning for the queries that read the change windows, you shouldn't miss many rows. With READ COMMITTED SNAPSHOT or SNAPSHOT isolation, an updated but uncommitted row would not appear in your query.
But you can also miss rows that got updated after you query @@DBTS. That's usually not a big deal, as they'll be in the next window. But if you have a row that is constantly updated, you may miss it for a long time.
But why use rowversion? If these are temporal tables you can query the history table directly. And Change Tracking is better and easier than using rowversion, as it tracks deletes and, optionally, column changes. The feature was literally built to replace the need to do this manually, which:
usually involved a lot of work and frequently involved using a combination of triggers, timestamp columns, new tables to store tracking information, and custom cleanup processes.
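For reference, a sketch of what Change Tracking looks like in use; MyDb, dbo.SourceTable, and the Id key column are hypothetical names:

ALTER DATABASE MyDb SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

-- The tracked table must have a primary key.
ALTER TABLE dbo.SourceTable ENABLE CHANGE_TRACKING
    WITH (TRACK_COLUMNS_UPDATED = ON);

-- Each ETL cycle reads everything changed since the version it stored last time.
DECLARE @last_sync_version bigint = 0;  -- normally persisted by the ETL job
SELECT ct.Id, ct.SYS_CHANGE_OPERATION, ct.SYS_CHANGE_VERSION
FROM CHANGETABLE(CHANGES dbo.SourceTable, @last_sync_version) AS ct;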
Under SNAPSHOT isolation, it turns out the proper function for building contiguous delta windows, while not skipping rowversion values attached to long-running transactions, is MIN_ACTIVE_ROWVERSION() rather than @@DBTS.
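In other words (a sketch reusing the logging table named in the question), capture the window marker with MIN_ACTIVE_ROWVERSION() so that rowversion values held by still-open transactions are not skipped:

insert into etl_cycle_logged_rowversion_marker (cycle_start_row_version)
select MIN_ACTIVE_ROWVERSION();

-- each cycle then reads rows where
--   row_version >= @previous_etl_cycle_rowversion
--   and row_version < @current_etl_cycle_rowversion
-- using the two most recent markers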
I am updating a column in a SQL Server table and I want to check whether my statement actually updated a row, or whether the row had already been updated and my query did nothing,
similar to what @@ROWCOUNT gives us in SQL Server.
In my case, I want to update a column named lockForProcessing: if the row is already being processed, my query will not affect any rows, which means someone else is already processing it; otherwise I will process it.
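For example, a minimal sketch (the dbo.WorkQueue table and the @Id key are hypothetical; only lockForProcessing comes from your description): claim the row with a guarded UPDATE and check @@ROWCOUNT to see whether this session won.

DECLARE @Id int = 1;  -- example key

UPDATE dbo.WorkQueue
SET lockForProcessing = 1
WHERE Id = @Id
  AND lockForProcessing = 0;  -- only succeeds if nobody has claimed it yet

IF @@ROWCOUNT = 1
    PRINT 'This session claimed the row - process it.';
ELSE
    PRINT 'Someone else is already processing it - skip.';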
If I understand you correctly, your problem is related to a multi threading / concurrency problem, where the same table may be updated simultaneously.
You may want to have a look at the NHibernate documentation chapter:
Chapter 11. Transactions And Concurrency
The ISession is not threadsafe!
The entity is not stored the moment the code session.SaveOrUpdate() is executed, but typically after transaction.Commit().
Stored and committed are two different things.
The entity is stored after any session.Flush(). Depending on the IsolationLevel, the entity may not yet be seen by other transactions.
The entity is committed after a transaction.Commit(). A commit also flushes.
Maybe all you need to do is choose the right IsolationLevel when beginning transactions and then read the table row to get the current value:
using (var transaction = session.BeginTransaction(IsolationLevel.Serializable))
{
    var entity = session.Get<MyEntity>(id); // read your row (MyEntity and id are placeholders)
    transaction.Commit();
}
Maybe it is easier to create some locking or pipeline mechanism in your application code though. Without knowing more about who is accessing the database (other transactions, sessions, processes?) it is hard to answer more precisely.
Is there a way (using config + transaction isolation levels) to ensure that there are no interim holes in a SQL Server IDENTITY column? Persistent holes are OK. The situation I am trying to avoid is when one query returns a hole but a subsequent similar query returns a row that was not yet committed when the query had been run the first time.
Your question is one of isolation levels and has nothing to do with IDENTITY. The same problem applies to any update/insert visibility. The first query can return results that include an uncommitted row in one and only one situation: if you use dirty reads (READ UNCOMMITTED). If you do, then you deserve all the inconsistent results you'll get and you deserve no help.
If you want to see stable results between two consecutive reads, you must have a transaction that encompasses both reads and use the SERIALIZABLE isolation level or, better, a row-versioning-based isolation level like SNAPSHOT. My recommendation would be to enable SNAPSHOT and use it. See Using Snapshot Isolation.
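A sketch of that setup, with MyDb as a placeholder database name:

ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    -- first read
    -- second read: sees exactly the same committed state as the first read
COMMIT TRANSACTION;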
All I need is the promise that inserts to a table are committed in the order of the identity values they claim.
I hope you read this again and realize the impossibility of the request ('promise ... commit..'). You can't ask for a guarantee of success before something has finished. What you're asking for eventually boils down to not allocating a new identity before the previously allocated one has committed successfully. In other words, full serialization of all insert transactions.
Most databases support some form of "insert into select..." statement.
insert into a
select value from b;
How is this being achieved?
My understanding: the rows that are present at the point in time when the statement starts execution qualify to be picked up, and they are inserted into table a. At the same time, new values can be inserted into table b, and they would not be "considered" since the query has already started execution.
Is my understanding close to being accurate? Any reference docs on this greatly appreciated.
Thanks!
The answer for most modern databases is multiversion concurrency control.
Basically, each row has a timestamp indicating the instant from which it is visible. The SELECT then considers the isolation level to decide whether rows added by transactions that committed before the current statement (for read committed isolation) or before the current transaction (for serializable isolation) should be visible to it.
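As a hedged illustration in SQL Server terms, using the tables a and b from the question and a row-versioning isolation level such as READ COMMITTED SNAPSHOT:

-- Session 1: the statement reads the version of b that existed when it started.
insert into a
select value from b;

-- Session 2, committing while session 1's statement is still running:
-- under row versioning this row is not seen by session 1's SELECT.
insert into b (value) values (42);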
Since you are not talking about any engine in particular, that could be happening. There could also be a point where the database just picks up rows one at a time; it all depends on the engine and the locks applied to the database.
"New values can be inserted" depends on your isolation level; for example, under SERIALIZABLE that will not happen.
I guess there are database-specific differences, but I can provide a general answer that holds for most of them.
When performing an "insert as select", the RDBMS goes and executes the SELECT statement. Like any other SELECT statement, the results are stored in a "virtual table" in memory (each database has its own cache and RAM management). The INSERT statement then becomes a normal multi-row INSERT, as the results in memory behave exactly like data provided directly in the statement.
At this stage, if any new row is inserted into the "selected" table, it will not affect the INSERT statement.
Finally, if the SELECT yields too many rows, or refers to a locked table, things could change, as the RDBMS would select the values differently.
SQL Server selectivity (makes use of indexes, which you'll want to look at as well)
- http://blog.namwarrizvi.com/?p=157
- http://www.sqlsolutions.com/articles/articles/How_Values_with_Irregular_Selectivity_Impact_SQL_Server_Database_Performance.htm
- http://sqlserverpedia.com/blog/sql-server-bloggers/index-columns-selectivity-and-inequality-predicates/
Oracle selectivity (again, these articles refer to index selectivity)
- http://www.akadia.com/services/ora_index_selectivity.html
- http://courses.csusm.edu/cs643yo/slides/optimization.htm (talks about architecture; might be more useful for you here)
In an ASP.NET application, I've got this process:
Start a connection
Start a transaction
Insert a lot of values into a table "LoadData" with the SqlBulkCopy class, with a column that contains a specific LoadId.
Call a stored procedure that:
reads the table "LoadData" for the specific LoadId,
for each row does a lot of calculations, which involves reading dozens of tables, and writes the results into a temporary (#temp) table (a process that lasts several minutes),
deletes the rows in "LoadData" for the specific LoadId,
and once everything is done, writes the results into the result table.
Commit the transaction, or roll back if something fails.
My problem is that if 2 users start the process, the second one has to wait until the first has finished (because the insert seems to take an exclusive lock on the table), and my application sometimes times out (and the users are not happy to wait :) ).
I'm looking for a way to let the users run everything in parallel, since there is no interaction between them except the last step: writing the result. I think what is blocking me is the inserts/deletes in the "LoadData" table.
I checked the other transaction isolation levels but it seems that nothing could help me.
What would be perfect would be to be able to release the exclusive lock on the "LoadData" table once the insert is finished, but without ending the transaction (is it possible to force SQL Server to lock only rows and not the whole table?).
Any suggestion?
Look up READ COMMITTED SNAPSHOT (the READ_COMMITTED_SNAPSHOT database option) in Books Online.
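A sketch, with MyDb as a placeholder database name; with this option on, readers get the last committed version of a row instead of blocking on writers' exclusive locks:

-- Needs to be run when no other connections are using the database,
-- or combined with WITH ROLLBACK IMMEDIATE.
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;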
Transactions should cover small and fast-executing pieces of SQL/code. They have a tendency to be implemented differently on different platforms. They will lock tables and then escalate the lock as the modifications grow, thus locking other users out from querying or updating the same row/page/table.
Why not forget the transaction and handle processing errors in another way? Is your data integrity truly being secured by the transaction, or can you do without it?
If you're sure that there is no issue with concurrent operations except for the last part, why not start the transaction just before those last statements (whichever they are that DO require isolation) and commit immediately after they succeed? Then all the upfront read operations will not block each other...
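One way to picture that restructuring (a sketch only; the Amount column and the dbo.Results table are made-up stand-ins for your real schema):

CREATE PROCEDURE dbo.ProcessLoad @LoadId int
AS
BEGIN
    -- 1. Heavy work outside any explicit transaction: touches only this LoadId's rows.
    SELECT Amount
    INTO #work
    FROM dbo.LoadData
    WHERE LoadId = @LoadId;

    -- ... the long-running calculations populate / transform #work here ...

    -- 2. A short transaction around only the statements that must be atomic.
    BEGIN TRANSACTION;
        INSERT INTO dbo.Results (LoadId, Amount)
        SELECT @LoadId, Amount FROM #work;

        DELETE FROM dbo.LoadData WHERE LoadId = @LoadId;
    COMMIT TRANSACTION;
END;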