Does TransactionScope really need the "Distributed Transaction Coordinator" service running? - sql-server

I am trying to understand the details of TransactionScope.
Below is quoted from here:
Some procedures in this topic use types in the System.Transactions
assembly. Before you follow these procedures, you must make sure that
the Distributed Transaction Coordinator service is running on the
computer where you run the unit tests. Otherwise, the tests fail, and
the following error message appears: "Test method
ProjectName.TestName.MethodName threw exception:
System.Data.SqlClient.SqlException: MSDTC on server 'ComputerName' is
unavailable".
But strangely enough, I stopped that service and did some DB deletion within a TransactionScope, without calling the Complete() method at the end, which means the transaction should roll back.
Indeed, the DB was not affected. It seems the transaction still works fine.
As I understand it, we need a resource manager (RM) and a transaction manager (TM) to make a transaction on resources happen. In my scenario, the Distributed Transaction Coordinator service is stopped, so who is the transaction manager then?

When you use TransactionScope, you are working with an ambient transaction, and that transaction is managed for you within your code.
MSDN defines the TransactionScope class as:
Makes a code block transactional. This class cannot be inherited.
...
Upon instantiating a TransactionScope by the new statement, the
transaction manager determines which transaction to participate in.
Once determined, the scope always participates in that transaction.
The decision is based on two factors: whether an ambient transaction
is present and the value of the TransactionScopeOption parameter in
the constructor. The ambient transaction is the transaction your code
executes in. You can obtain a reference to the ambient transaction by
calling the static Current property of the Transaction class.
Also from msdn:
The TransactionScope class provides a simple way to mark a block of
code as participating in a transaction, without requiring you to
interact with the transaction itself. A transaction scope can select
and manage the ambient transaction automatically.
Also from msdn:
A TransactionScope object has three options:
Join the ambient transaction, or create a new one if one does not exist.
Be a new root scope, that is, start a new transaction and have that transaction be the new ambient transaction inside its own scope.
Not take part in a transaction at all. There is no ambient transaction as a result.
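To make the quoted options concrete, here is a minimal sketch (not from the original post; it only inspects Transaction.Current, so no database is required) showing how a root scope creates the ambient transaction and how a nested Required scope joins it:

using System;
using System.Transactions;

class AmbientTransactionDemo
{
    static void Main()
    {
        Console.WriteLine(Transaction.Current == null);      // True: no ambient transaction yet

        using (var scope = new TransactionScope())           // root scope: creates the ambient transaction
        {
            // Transaction.Current now returns the ambient transaction created by the scope.
            Console.WriteLine(Transaction.Current.TransactionInformation.Status);   // Active

            using (var nested = new TransactionScope(TransactionScopeOption.Required))
            {
                // Required joins the existing ambient transaction instead of starting a new one.
                nested.Complete();
            }

            scope.Complete();                                 // vote to commit; Dispose() performs the commit
        }

        Console.WriteLine(Transaction.Current == null);      // True again: the ambient transaction is gone
    }
}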

The DTC service is only needed when the transaction is escalated. See more on this here: Transaction Management Escalation
Escalation can be difficult to predict beforehand because, by design, it is largely automatic, which is convenient but sometimes unexpected. Basically, if you're running a transaction against a single SQL Server instance (one "Resource Manager" / RM), there is a good chance escalation will be avoided - note that this requires at least SQL Server 2005, I believe; with SQL Server 2000 escalation always happens (see this link: TransactionScope: transaction escalation behavior). Avoiding escalation is generally a good thing because escalation can be costly in terms of performance.
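Tying this back to the original question, a sketch like the one below (connection string and table are made up for illustration) uses a single connection to one SQL Server 2005+ instance, so the transaction stays a lightweight one managed by SQL Server itself and never escalates to MSDTC; disposing the scope without calling Complete() rolls the work back even with the DTC service stopped. It is opening a second connection to another RM inside the same scope that would trigger escalation.

using System.Data.SqlClient;
using System.Transactions;

class NoEscalationDemo
{
    // Hypothetical connection string, for illustration only.
    const string ConnStr = "Server=.;Database=TestDb;Integrated Security=true";

    static void Main()
    {
        using (var scope = new TransactionScope())
        using (var conn = new SqlConnection(ConnStr))   // single resource manager: no MSDTC involved
        {
            conn.Open();
            new SqlCommand("DELETE FROM Orders WHERE Id = 42", conn).ExecuteNonQuery();

            // scope.Complete() is intentionally NOT called:
            // the delete is rolled back when the scope is disposed,
            // handled entirely by SQL Server's own (local) transaction manager.
        }
    }
}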

Related

Connection hangs when using transaction suspension with WebSphere datasource

While using a WebSphere datasource to connect to SQL Server, the Spring transaction hangs (locks) with REQUIRES_NEW or NOT_SUPPORTED propagation.
This is not the case when using BasicDataSource or any other datasource.
Please help.
REQUIRES_NEW and NOT_SUPPORTED are both transaction attributes that would prevent running the operation in the current transaction (which would be suspended and then resumed afterwards). The WebSphere Application Server data source is aware of container managed transactions and will enlist in them. I'm not sure what BasicDataSource is, but if it is not aware of container transactions, that could explain why you see a difference in behavior.
The flow to cause a single-thread deadlock could be something like this:
transaction begin
SQL command that locks row X
invoke REQUIRES_NEW method
implies transaction suspend, new transaction begin
SQL command that attempts to lock row X <-- deadlock here
If this is what you are seeing, it is operating as designed, per spec, and you should consider using a different transaction attribute, such as SUPPORTS, if you want the operation to run in the same transaction.
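For readers coming from the .NET side of this page, a rough analogue of that flow can be reproduced with TransactionScopeOption.RequiresNew, which behaves like REQUIRES_NEW (the connection string and Accounts table below are invented for illustration; this is a sketch of the pattern, not the poster's code):

using System.Data.SqlClient;
using System.Transactions;

class RequiresNewSelfBlock
{
    const string ConnStr = "Server=.;Database=TestDb;Integrated Security=true"; // hypothetical

    static void Main()
    {
        using (var outer = new TransactionScope())
        using (var conn1 = new SqlConnection(ConnStr))
        {
            conn1.Open();
            // Locks "row X" in the outer transaction.
            new SqlCommand("UPDATE Accounts SET Balance = Balance - 10 WHERE Id = 1", conn1).ExecuteNonQuery();

            // Suspends the ambient transaction and starts an independent one (like REQUIRES_NEW).
            using (var inner = new TransactionScope(TransactionScopeOption.RequiresNew))
            using (var conn2 = new SqlConnection(ConnStr))
            {
                conn2.Open();
                // Blocks: the independent transaction waits for the outer transaction's
                // row lock, which is never released, until the command times out.
                new SqlCommand("UPDATE Accounts SET Balance = Balance + 10 WHERE Id = 1", conn2).ExecuteNonQuery();
                inner.Complete();
            }

            outer.Complete();
        }
    }
}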

Is this correct behaviour for transactions?

I have three transactional services that are executed within a transaction boundary (startTransaction or begin transaction). The three services use different connections (No_Transaction, Local_Transaction and XA_Transaction) for their processing. Now I want to know: when I start a transaction (using javax.transaction.TransactionManager) and run these three services within the transaction boundary, I can see that the services that used the NO and LOCAL transaction connections are able to insert data into their tables. Then I insert more data than a column constraint allows using the XA service (which I know is supposed to fail) and call commit (with a rollback procedure in case of any failures). Afterwards I have data in the tables of the NO and LOCAL connections, while the XA connection's table has no data. Now:
I want to know: when the transaction fails at one point, is it supposed to roll back the data in all the tables, or only the data of the XA service?
I also wanted to know: a 'transaction', as I understand it, is a procedure for changing data atomically. So why does creating a connection include defining the type of transaction that connection can perform; isn't that a property of the transaction?
I also want to know why we have to define the transaction type in the connection properties; shouldn't we instead define the type of transaction when we start a transaction, and have the transaction manager perform the given type of transaction?
Thanks in advance.
Let's start with the simplest transaction mode and increase complexity.
No transaction
A 'no transaction' connection is one that does not 'commit' or 'roll back' data such as sending email. Once you have passed the message object to the email server, it is sent to the recipient and no amount of pleading will ever get the message back again. It's almost as if every call is committed by the time the call returns. Examples of this kind of connection include connection to SMTP, SMS gateways, printers and so on.
I believe that you can use a database connection in this manner if you have auto-commit on, but it begs the question on why you have a full ACID database in the first place...
'Normal' transactions
The normal connection, for example to a SQL database, has the ability to store up a series of state change commands in an internal buffer. When everything has been done, and all appears OK, then the whole buffer of changes is written to the data store and other connections can see the changes. If something goes wrong, before or even during the commit, the whole set of changes can be discarded (rolled back).
One critical limitation of this type of connection is the scope of the buffer - the buffer is part of the connection itself. In other words, it is only through the connection that you can write to the buffer.
An important responsibility of the application server is to manage these connections. When you ask the connection pool to give you the connection, you magically get the same connection each time (within a single transaction). This is true even when one EJB calls another or when an EJB calls into a Resource Adapter (assuming you use REQUIRES_TRANSACTION semantics; you can override this with REQUIRES_NEW). This behaviour means that one web request can make multiple EJB calls, each of which can interact with multiple entity beans, and all the data manipulation occurs on a single connection with a single internal buffer. It will all be committed or rolled back together.
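Translating the 'buffer on the connection' idea into plain ADO.NET against SQL Server (connection string and Accounts table invented for illustration), the buffer is exposed as a transaction object that is tied to exactly one connection:

using System;
using System.Data.SqlClient;

class LocalTransactionDemo
{
    const string ConnStr = "Server=.;Database=TestDb;Integrated Security=true"; // hypothetical

    static void Main()
    {
        using (var conn = new SqlConnection(ConnStr))
        {
            conn.Open();
            using (SqlTransaction tx = conn.BeginTransaction())   // the "buffer" lives on this connection
            {
                try
                {
                    new SqlCommand("UPDATE Accounts SET Balance = Balance - 10 WHERE Id = 1", conn, tx).ExecuteNonQuery();
                    new SqlCommand("UPDATE Accounts SET Balance = Balance + 10 WHERE Id = 2", conn, tx).ExecuteNonQuery();
                    tx.Commit();        // only now do other connections see the changes
                }
                catch
                {
                    tx.Rollback();      // the whole set of buffered changes is discarded
                    throw;
                }
            }
        }
    }
}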
Transactions with multiple connections
This is great when you have a single database - but (by definition) you need separate connections if you talk to separate database instances (eg on different machines). So what happens then? Your EJB transaction ends up associated with multiple connections - each connection to a unique database. This appears to work well, except in one situation:
You have Connection A to Database A and Connection B to Database B
You execute DML statements on A and B
You commit the EJB transaction. The Application Server now:
Commits Connection A - success
Commits Connection B - fails (eg a constraint violation) and Connection B rolls back
This is a disaster - you have committed the transaction in Database A, and this cannot now be rolled back. However, the transaction (and the whole EJB) is rolled back on Database B.
(Interestingly, your example is almost identical to this - you have data committed through the no-transaction and normal-transaction connections, but not through the XA connection - the last of the three connections)
XA Transactions
This is where XA comes in. It provides logic to co-ordinate transactions being committed against different data sources, and simulates a single transaction over multiple data sources. XA commits with a "two-phase commit" managed by a transaction co-ordinator that manages a number of XA connections co-opted into the XA transaction. The co-ordinator:
Sends a message to each data source through the XA Connection to see if the transaction can be committed: All constraint and database logic is executed up to the point just before a final commit. If any database reports a failure, the XA co-ordinator rolls back the whole transaction. Phase 1 is where almost all the transaction work is carried out and so takes comparatively long
When every database has reported that the transaction can be committed, the co-ordinator sends a message to every database to commit the transaction. This happens very fast.
Note that the two-phase commit can fail if something goes wrong in phase 2 (eg part of the network crashes or one of the databases is powered off between phase 1 and phase 2).
Because an XA connection behaves so differently from a normal connection, it typically needs a different ConnectionFactory object which instantiates different object instances than a non-XA ConnectionFactory. In addition, the XA ConnectionFactory needs configuration parameters for the XA transaction co-ordinator, such as XA transaction timeouts, which are in addition to the ordinary transaction properties.
Another constraint: Only Connections created by an XA ConnectionFactory can join an XA Transaction and the associated two-phase commit. You can have both XA and non-XA connections participating in a single Application Server transaction, but then the entire transaction cannot reliably commit/rollback as a single transaction (as above).
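On the .NET side of this page, the equivalent of enlisting two resource managers is simply opening connections to two different SQL Server instances inside one TransactionScope; System.Transactions then promotes the lightweight transaction to a distributed one and MSDTC coordinates the two-phase commit described above (server names, databases and the Log table are invented for illustration):

using System.Data.SqlClient;
using System.Transactions;

class TwoPhaseCommitDemo
{
    // Hypothetical connection strings to two different SQL Server instances (two RMs).
    const string ConnA = "Server=ServerA;Database=DbA;Integrated Security=true";
    const string ConnB = "Server=ServerB;Database=DbB;Integrated Security=true";

    static void Main()
    {
        using (var scope = new TransactionScope())
        {
            using (var a = new SqlConnection(ConnA))
            {
                a.Open();   // first RM: still a lightweight, local transaction
                new SqlCommand("INSERT INTO Log(Msg) VALUES('written on A')", a).ExecuteNonQuery();
            }

            using (var b = new SqlConnection(ConnB))
            {
                b.Open();   // second RM: the transaction is promoted to MSDTC here
                new SqlCommand("INSERT INTO Log(Msg) VALUES('written on B')", b).ExecuteNonQuery();
            }

            // Complete() makes Dispose() ask MSDTC to run the two-phase commit:
            // phase 1 prepares both servers, phase 2 commits them.
            scope.Complete();
        }
    }
}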
Specific answers
I want to know: when the transaction fails at one point, is it supposed to roll back the data in all the tables, or only the data of the XA service?
If the transaction fails before the application server attempts a commit (eg your EJB gets a NPE or you deliberately roll back), each connection will receive a rollback, and everything should be just as you expect.
However, if the transaction fails in the commit logic (eg a database constraint), then the transaction manager will attempt to roll everything back; this cannot happen if a non-XA connection has already committed.
I also wanted to know: a 'transaction', as I understand it, is a procedure for changing data atomically. So why does creating a connection include defining the type of transaction that connection can perform; isn't that a property of the transaction?
The XA connection uses a different library and protocol than the ordinary connection, because the connection itself needs to communicate with the XA Transaction Co-ordinator. Normal connections don't do this.
I also want to know why we have to define the transaction type in the connection properties; shouldn't we instead define the type of transaction when we start a transaction, and have the transaction manager perform the given type of transaction?
Because the XA connection uses different code, the connection pool needs to load a different class when compared to the normal connection. This is why the connection pool (not connection) properties are different.
Yes - if a transaction fails to write its commit entry to the log file, then it rolls back completely (the atomicity property of a transaction).
A transaction is an atomic unit of database processing: whatever operations you perform in the database within a transaction will be atomic.
By default, a transaction is of the autocommit type, but if you use your own code to state the start point and commit point of a transaction, it is an explicit transaction (http://msdn.microsoft.com/en-us/library/ms172353.aspx).

Should I wrap every db call in a transaction?

I've written a TransactionContext class that is instantiated in the application layer and is sent down to the business and data layers, allowing nested transactions. Now I must decide:
1. Should I use explicit transactions and let every function call begin, commit or rollback on the transaction as needed?
2. Or should I start the transaction implicitly when the TransactionContext is created and let nested methods only roll back?
Now, I would use the second approach because it's easier to code: no worrying about begin, commit or rollback in every method; just set the rollback flag on the transaction and let only the topmost method worry about commit or rollback. The problem is that I'm not sure if wrapping all database traffic in a transaction is a good idea.
What are possible negative effects with wrapping all database calls inside the transaction?
My setup is an ASP.NET application and a MSSQL Server database. It is possible that the application and database will be on different servers, if that's something that influences the decision.
Single SQL statements are already wrapped in an implicit transaction. Use a transaction where it is needed, i.e. when updating multiple tables in an atomic operation. Wrapping all calls to the DB is not a good idea: it might lead to reduced throughput and blocking.
Although SQL Server supports nested transactions, they might not work as you expect:
Committing inner transactions is
ignored by the SQL Server Database
Engine. The transaction is either
committed or rolled back based on the
action taken at the end of the
outermost transaction. If the outer
transaction is committed, the inner
nested transactions are also
committed. If the outer transaction is
rolled back, then all inner
transactions are also rolled back,
regardless of whether or not the inner
transactions were individually
committed.
Ref.: Nesting Transactions
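In TransactionScope terms (sticking with the question's .NET setup; connection string and table are made up), the same rule applies: an inner scope that joins the ambient transaction only votes, and the outermost scope decides whether anything commits:

using System.Data.SqlClient;
using System.Transactions;

class NestedScopeDemo
{
    const string ConnStr = "Server=.;Database=TestDb;Integrated Security=true"; // hypothetical

    static void Main()
    {
        using (var outer = new TransactionScope())
        {
            using (var inner = new TransactionScope())       // default: joins the ambient transaction
            using (var conn = new SqlConnection(ConnStr))
            {
                conn.Open();
                new SqlCommand("UPDATE Orders SET Status = 'Paid' WHERE Id = 1", conn).ExecuteNonQuery();
                inner.Complete();                             // only a vote; nothing is committed yet
            }

            // outer.Complete() is intentionally not called:
            // the whole transaction, including the inner scope's work, is rolled back.
        }
    }
}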

SQL Server transactions / concurrency confusion - must you always use table hints?

When you create a SQL Server transaction, what gets locked if you do not specify a table hint? In order to lock something, must you always use a table hint? Can you lock rows/tables outside of transactions (i.e. in ordinary queries)? I understand the concept of locking and why you'd want to use it, I'm just not sure about how to implement it in SQL Server, any advice appreciated.
You should use query hints only as a last resort, and even then only after expert analysis. In some cases, they will cause a query to perform badly. So, unless you really know what you are doing, avoid using query hints.
Locking (of various types) happens automatically every time you perform a query (unless NOLOCK is specified). The default transaction isolation level is READ COMMITTED.
What are you actually trying to do?
Understanding Locking in SQL Server
"Can you lock rows/tables outside of transactions (i.e. in ordinary queries)?"
You'd better understand that there are no ordinary queries or actions in SQL Server; they are ALL, without exception, transactional. This is how ACID-ness is achieved; see, for example, [1]. If client tools or the developer do not specify a transaction explicitly with BEGIN TRANSACTION and COMMIT/ROLLBACK, then each statement runs in its own autocommit transaction.
Also, a transaction is not a synonym for locking or lock engagement. There is a plethora of mechanisms to control concurrency without locking (for example, versioning, etc.), and the READ UNCOMMITTED transaction "isolation" level (in this case, the absence of any isolation) does not control concurrency at all.
Update2:
In order to lock something, must you always use a table hint?
As long as the transaction isolation level is not READ UNCOMMITTED or one of the row-versioning (snapshot) isolation levels - for example, with the default READ COMMITTED, or a level set explicitly by, for example,
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
locks are issued (I do not know where to start and where to end on this topic [2]). Table hints, which can be used in individual statements, override these settings.
[1]
Paul S. Randal. Understanding Logging and Recovery in SQL Server
What is Logging?
http://technet.microsoft.com/en-us/magazine/2009.02.logging.aspx#id0060003
[2]
Isolation (database systems)
http://en.wikipedia.org/wiki/Isolation_(database_systems)
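Since this page is mostly about TransactionScope, it is worth noting that the same isolation level can be requested from .NET through TransactionOptions instead of a SET statement; the sketch below (hypothetical connection string and table) asks for SERIALIZABLE, the level the SET TRANSACTION ISOLATION LEVEL example above sets in T-SQL:

using System;
using System.Data.SqlClient;
using System.Transactions;

class SerializableScopeDemo
{
    const string ConnStr = "Server=.;Database=TestDb;Integrated Security=true"; // hypothetical

    static void Main()
    {
        var options = new TransactionOptions
        {
            IsolationLevel = IsolationLevel.Serializable,   // key/range locks held until the scope ends
            Timeout = TimeSpan.FromSeconds(30)
        };

        using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
        using (var conn = new SqlConnection(ConnStr))
        {
            conn.Open();    // this transaction runs under SERIALIZABLE
            new SqlCommand("SELECT COUNT(*) FROM Orders WHERE CustomerId = 7", conn).ExecuteScalar();
            scope.Complete();
        }
    }
}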

If I access UserTransaction does this mean that I use 2 phase commit or XA?

UserTransaction ut = lookup....   // obtained via JNDI lookup
ut.begin();                       // javax.transaction.UserTransaction defines begin(), not beginTransaction()
saveToFooDB();
statelessEjb.transactionSupportedMethod(); // saves something to the Foo DB
saveToFooDB();
ut.commit();
If I was doing the above, then my understanding is that it is not an XA transaction, as it doesn't span multiple resources (like a DB plus JMS). Is my understanding correct?
A data source can be configured in one of two kinds:
XA: these data sources can participate in distributed transactions
Local: also called non-XA, they cannot participate in a distributed transaction
The UserTransaction is defined in the JTA specification, which describes how to coordinate the participants in a distributed transaction.
The application server which implements the JTA specification is, however, free to do a lot of optimizations. One of them is the last-agent optimization, which allows the last participant in the distributed transaction to be local. A regular (local) commit is then done for that last participant. If there is only one participant, this is always the case.
In short:
if there is more than one participant, XA and two-phase commit need to be used
if there is only one participant, most application servers support a local data source and do not use the full-blown two-phase commit protocol
For Glassfish see:
last-agent-optimization
configure JDBC data source
EDIT
The "Transaction Scope" paragraph of the GlassFish documentation explains it better than I can. I guess it's the same for all application servers.
A local transaction involves only one
non-XA resource and requires that all
participating application components
execute within one process. Local
transaction optimization is specific
to the resource manager and is
transparent to the Java EE
application.
In the Application Server, a JDBC
resource is non-XA if it meets any of
the following criteria:
In the JDBC connection pool configuration, the DataSource class
does not implement the
javax.sql.XADataSource interface.
The Global Transaction Support box is not checked, or the Resource
Type setting does not exist or is not
set to javax.sql.XADataSource.
A transaction remains local if the
following conditions remain true:
One and only one non-XA resource is used. If any additional non-XA
resource is used, the transaction is
aborted.
No transaction importing or exporting occurs.
Transactions that involve multiple
resources or multiple participant
processes are distributed or global
transactions. A global transaction can
involve one non-XA resource if last
agent optimization is enabled.
Otherwise, all resources must be XA.
The use-last-agent-optimization
property is set to true by default.
For details about how to set this
property, see Configuring the
Transaction Service.
If only one XA resource is used in a
transaction, one-phase commit occurs,
otherwise the transaction is
coordinated with a two-phase commit
protocol.
Once you start the UserTransaction and then obtain a connection to a resource (e.g. a database) using a connection factory that is declared as XA-capable, that connection will become part of the XA transaction. Also, it does not matter at all whether you are connecting to a single type of resource or to multiple types of resources such as JMS and a database.
Hope that helps.
Nitin
