Should I wrap every DB call in a transaction?

I've written a TransactionContext class that is instantiated in the application layer and passed down to the business and data layers, allowing nested transactions. Now I must decide:
1. Should I use explicit transactions and let every function call begin, commit, or rollback on the transaction as needed?
2. Or should I start the transaction implicitly when the TransactionContext is created and let nested methods only roll back?
I would go with the second approach because it's easier to code: no worrying about begin, commit, or rollback in every method; just set the rollback flag on the transaction and let only the topmost method worry about commit or rollback. The problem is that I'm not sure whether wrapping all database traffic in a transaction is a good idea.
What are possible negative effects with wrapping all database calls inside the transaction?
My setup is an ASP.NET application and an MSSQL Server database. It is possible that the application and database will be on different servers, if that influences the decision.
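For reference, here is a minimal sketch of what the second approach could look like; the class shape, names, and use of System.Data.SqlClient are my assumptions, not the actual code from the question:

// Hypothetical sketch of option 2: begin implicitly on creation, let nested
// methods only set a rollback flag, commit/rollback once at the top.
// All names here are illustrative, not the asker's real code.
using System;
using System.Data.SqlClient;

public sealed class TransactionContext : IDisposable
{
    private readonly SqlConnection _connection;
    private readonly SqlTransaction _transaction;
    private bool _rollbackRequested;

    public TransactionContext(string connectionString)
    {
        _connection = new SqlConnection(connectionString);
        _connection.Open();
        _transaction = _connection.BeginTransaction(); // implicit begin
    }

    public SqlConnection Connection => _connection;
    public SqlTransaction Transaction => _transaction;

    // Nested methods never commit; they only flag failure.
    public void SetRollback() => _rollbackRequested = true;

    // Only the topmost method disposes the context, deciding commit vs rollback.
    public void Dispose()
    {
        if (_rollbackRequested) _transaction.Rollback();
        else _transaction.Commit();
        _connection.Dispose();
    }
}

Real code would also need to roll back when an exception escapes the top-level method, rather than committing unconditionally in Dispose.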

Single SQL statements are already wrapped in an implicit transaction. Use a transaction where it is needed, i.e. when updating multiple tables in an atomic operation. Wrapping all calls to the DB is not a good idea: it might lead to reduced throughput and blocking.
Although SQL Server supports nested transactions, they might not work as you expect:
Committing inner transactions is ignored by the SQL Server Database Engine. The transaction is either committed or rolled back based on the action taken at the end of the outermost transaction. If the outer transaction is committed, the inner nested transactions are also committed. If the outer transaction is rolled back, then all inner transactions are also rolled back, regardless of whether or not the inner transactions were individually committed.
Ref.: Nesting Transactions
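You can see that behaviour with a small batch; a sketch (table name, column, and connection string are placeholders I made up):

// The inner COMMIT only decrements @@TRANCOUNT; the outer ROLLBACK undoes
// everything, including the insert the inner transaction "committed".
using System.Data.SqlClient;

const string connectionString = "..."; // placeholder

const string sql = @"
BEGIN TRAN;                       -- @@TRANCOUNT = 1
    INSERT INTO MyTable VALUES (1);
    BEGIN TRAN;                   -- @@TRANCOUNT = 2
        INSERT INTO MyTable VALUES (2);
    COMMIT;                       -- @@TRANCOUNT back to 1, nothing durable yet
ROLLBACK;                         -- both inserts are rolled back
";

using var connection = new SqlConnection(connectionString);
connection.Open();
using var command = new SqlCommand(sql, connection);
command.ExecuteNonQuery();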

Related

Where to put BEGIN TRAN, in the code or in the stored procedure?

Let's say I have a stored procedure that is doing 3 inserts. To make sure everything is working fine, I add a begin and commit tran in the stored procedure.
Then from the code side (.NET, C#), the programmer is also creating a transaction.
What would be the best approach?
Having it in both places?
Having that in the C# code only?
Having that in the stored procedure only?
It's better to only do it in the stored procedure for a number of reasons:
The procedure can keep better control of when to begin and commit the transaction.
It can also control the isolation level, which should usually be set before the transaction starts.
It keeps database code close to the database (somewhat subjective).
If the connection is severed and the server does not realize it, a transaction opened by the client may not be committed or rolled back for some time, and could leave locks hanging, causing a huge chain of blocking.
The client starting and committing the transaction requires two extra round-trips over the network, which may be problematic for a latency-sensitive app. (SET NOCOUNT ON should be used for the same reason.) The transaction and its associated locking are also extended for that time, causing further blocking problems.
Do use SET XACT_ABORT ON; in case of exceptions this causes an automatic rollback and prevents them from leaving hanging transactions.
It may still make sense to use client-side transactions, especially with distributed transactions.
Having transactions in both client code and the procedure is silly and wasteful. Choose one or the other option and stick to it.
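Putting those points together, a sketch of the procedure-side pattern (procedure name, tables, and connection string are invented for illustration):

// SET NOCOUNT ON saves 'rows affected' chatter on the wire; SET XACT_ABORT ON
// makes any error roll back instead of leaving a hanging transaction; the
// BEGIN/COMMIT pair lives inside the procedure, so the client makes a single
// round-trip and opens no transaction of its own.
using System.Data.SqlClient;

const string connectionString = "..."; // placeholder

const string createProc = @"
CREATE PROCEDURE dbo.DoThreeInserts
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;
    BEGIN TRAN;
        INSERT INTO dbo.TableA DEFAULT VALUES;
        INSERT INTO dbo.TableB DEFAULT VALUES;
        INSERT INTO dbo.TableC DEFAULT VALUES;
    COMMIT;
END";

using var connection = new SqlConnection(connectionString);
connection.Open();
using var command = new SqlCommand(createProc, connection);
command.ExecuteNonQuery();

The C# caller then simply executes the procedure, with no TransactionScope or SqlTransaction on the client side.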

Payment Transactions vs Database Transactions

So in payments processing, you have this concept of payment transaction. As payments or messages come in from various internal and external interfaces, payment transactions get created, probably in some main base transaction table in some relational database (at least for the sake of this question). Then the transactions have some state that changes through a series of edits until one of several final states (paid or unpaid, approved or declined, etc) is reached.
When dealing with databases, you of course have the database transaction, and my question is really, are there any rules of thumb about processing payments transactions within database transactions? The transaction is often an aggregate root to many other tables for information on the customer or cardholder or merchant or velocity settings that participate in that transaction.
I could see a rule saying, "never process more than one payment transaction in a database transaction". But I could also see a database transaction being correct when performing batch type operations, when you must consider the whole batch of transactions successful or failed, so you have the option of rollback.
The normal design pattern is to stick to the following rule: Use database transactions to transition the database from one valid state to another.
In the context of payment transactions that probably means that adding a transaction is one db transaction. Then, each processing step on the transaction (like validation, fulfillment, ...) would be another db transaction.
I could see a rule saying, "never process more than one payment transaction in a database transaction"
You can put multiple logical transactions into one physical transaction for performance or architectural reasons. That's not necessarily a problem. You need to make sure that work is not lost, though, because a failure will abort the entire batch.
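As a sketch of the "one db transaction per state transition" rule above, with all names (Payment, PaymentStatus, Validate, SavePayment) hypothetical:

// Each processing step is its own short database transaction: the database
// moves from one valid state (Received) to another (Validated or Declined).
using System.Transactions;

enum PaymentStatus { Received, Validated, Declined }

class Payment { public int Id; public PaymentStatus Status; }

static class PaymentProcessor
{
    public static void ValidateStep(Payment payment)
    {
        using var scope = new TransactionScope();
        payment.Status = Validate(payment) ? PaymentStatus.Validated
                                           : PaymentStatus.Declined;
        SavePayment(payment);  // persist the new state
        scope.Complete();      // commit; disposing without Complete rolls back
    }

    static bool Validate(Payment p) => true;         // stub
    static void SavePayment(Payment p) { /* stub */ }
}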

Locking single row / row level when inserting data via TransactionScope, Entity Framework and SQL Server

I did some research and haven't found any explanation for this. So, if there's one out there, I'm sorry.
I am using TransactionScope to handle transactions with my SQL Server 2012 database. I'm also using Entity Framework.
The fact is that when I start a new transaction to insert a new record into my table, it locks the entire table, not just that row. So, if I run Db.SaveChanges() without committing it, go to Management Studio, and try to read the already committed data from the same table, it hangs and returns no data.
What I would like in this scenario is to lock just the new row, not the entire table.
Is that possible?
Thank you in advance.
One thing to be very careful of when using TransactionScope is that it uses the Serializable isolation level by default, which can cause many locking issues in SQL Server. The default isolation level in SQL Server is Read Committed, so you should consider using that in any transactions that use TransactionScope. You can factor out a method that creates your default TransactionScope and always sets ReadCommitted (see Why is System.Transactions TransactionScope default Isolationlevel Serializable). Also ensure that you wrap the TransactionScope in a using block, so that if errors occur during transaction processing the transaction is rolled back (http://msdn.microsoft.com/en-us/library/yh598w02.aspx).
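For example, such a factory method might look like this (a sketch; the method and class names are my own, and Db stands for the question's EF context):

// Creates a TransactionScope that uses SQL Server's default isolation level
// (Read Committed) instead of TransactionScope's default (Serializable).
using System.Transactions;

public static class Transactions
{
    public static TransactionScope CreateReadCommittedScope()
    {
        var options = new TransactionOptions
        {
            IsolationLevel = IsolationLevel.ReadCommitted,
            Timeout = TransactionManager.DefaultTimeout
        };
        return new TransactionScope(TransactionScopeOption.Required, options);
    }
}

// Usage; the using block guarantees rollback if Complete() is never reached:
//
//     using (var scope = Transactions.CreateReadCommittedScope())
//     {
//         Db.SaveChanges();
//         scope.Complete();
//     }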
By default, SQL Server uses a pessimistic concurrency model, which means that as DML commands are being processed (inserts, updates, deletes), it acquires an exclusive lock on the data that is changing, preventing other updates or SELECTs from completing until those locks are released. The only way to release those locks is to commit or roll back the transaction. So if you have a transaction that is inserting data into a table, and you run a SELECT * FROM myTable before the insert has completed, then SQL Server will force your select to wait until the open transaction has been committed or rolled back before returning the results. Normally transactions should be small and fast, and you would not notice much of an issue. Here is more info on isolation levels and locking (http://technet.microsoft.com/en-us/library/ms378149.aspx).
In your case, it sounds like you are debugging and have hit a breakpoint in the code with the transaction open. For debugging purposes, you can add a nolock hint to your query, which will show the already committed data along with the insert that has not yet been committed. Because nolock returns uncommitted data, be very careful about using it in any production environment. Here is an example of a query with a nolock hint:
SELECT * FROM myTable WITH(NOLOCK)
If you continue to run into locking issues outside of debugging, then you can also check out snapshot isolation (Great article by Kendra Little: http://www.brentozar.com/archive/2013/01/implementing-snapshot-or-read-committed-snapshot-isolation-in-sql-server-a-guide/). There are some special considerations when using snapshot isolation, such as tempdb tuning.

Does TransactionScope really need the "Distributed Transaction Coordinator" service running?

I am trying to understand the details of TransactionScope.
Below is quoted from here:
Some procedures in this topic use types in the System.Transactions assembly. Before you follow these procedures, you must make sure that the Distributed Transaction Coordinator service is running on the computer where you run the unit tests. Otherwise, the tests fail, and the following error message appears: "Test method ProjectName.TestName.MethodName threw exception: System.Data.SqlClient.SqlException: MSDTC on server 'ComputerName' is unavailable".
But strangely enough, I stopped that service, did some DB deletion within a TransactionScope, and didn't call the Complete() method at the end, which means the transaction should roll back.
Indeed, the DB is not affected. It seems the transaction still works fine.
As I understand, we need resource manager (RM) and transaction manager (TM) to make the transaction on resources happen. In my scenario, the Distributed Transaction coordinator service is stopped, who is the transaction manager then?
When you use TransactionScope, you are working with an ambient transaction, and the transaction is managed for you, independently of your code.
The TransactionScope class is defined by msdn as:
Makes a code block transactional. This class cannot be inherited.
...
Upon instantiating a TransactionScope by the new statement, the transaction manager determines which transaction to participate in. Once determined, the scope always participates in that transaction. The decision is based on two factors: whether an ambient transaction is present and the value of the TransactionScopeOption parameter in the constructor. The ambient transaction is the transaction your code executes in. You can obtain a reference to the ambient transaction by calling the static Current property of the Transaction class.
Also from msdn:
The TransactionScope class provides a simple way to mark a block of code as participating in a transaction, without requiring you to interact with the transaction itself. A transaction scope can select and manage the ambient transaction automatically.
Also from msdn:
A TransactionScope object has three options:
Join the ambient transaction, or create a new one if one does not exist.
Be a new root scope, that is, start a new transaction and have that transaction be the new ambient transaction inside its own scope.
Not take part in a transaction at all. There is no ambient transaction as a result.
The DTC service is only needed when the transaction is escalated. See more on this here: Transaction Management Escalation
Escalation can be difficult to determine beforehand because, by design, it is pretty automatic, which is cool but sometimes unexpected. Basically, if you're running a transaction against a single SQL Server instance (one "Resource Manager" / RM), there is a good chance escalation will be avoided (not on SQL Server 2000, though; I believe you need at least SQL Server 2005, or escalation always happens; see this link: TransactionScope: transaction escalation behavior). In general, avoiding escalation is a good thing, because it can be costly in terms of performance.
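As a rough sketch of when that matters (connection strings are placeholders; exact escalation behaviour depends on the SQL Server version, as noted above):

using System.Transactions;
using System.Data.SqlClient;

const string connectionStringA = "..."; // placeholder
const string connectionStringB = "..."; // placeholder

using (var scope = new TransactionScope())
{
    using (var first = new SqlConnection(connectionStringA))
    {
        first.Open(); // one resource manager: can stay a local transaction
        // ... commands on 'first' ...
    }

    using (var second = new SqlConnection(connectionStringB))
    {
        second.Open(); // a second resource manager: the transaction may
                       // escalate to MSDTC, so the DTC service must be running
    }

    scope.Complete();
}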

Is it a correct behaviour of transactions?

I have three transaction services that are executed within a transaction boundary (startTransaction or begin transaction). The three services use different connections (No_Transaction, Local_Transaction and XA_Transaction) for their processing. When I start a transaction (using javax.transaction.TransactionManager) and run these three services within the transaction boundary, I can see that the services that used the NO and LOCAL transactions are able to insert data into their tables. Then I insert data that violates a column constraint using the XA service (which I know is supposed to fail) and call commit (with a rollback procedure if there are any failures). Afterwards there is data in the tables of the NO and LOCAL connections, while the XA connection's table has no data. Now:
I want to know: when the transaction has failed at one point, is it supposed to roll back all the data from all the tables, or just the data of the XA service?
I also wanted to know: a 'transaction', as I understand it, is a procedure for changing data atomically. So why does connection creation include defining the type of transaction the connection can perform; isn't that a property of the transaction?
I also want to know why we have to define the transaction type in the connection properties; shouldn't we instead define the type of transaction when we start it, and have the transaction manager perform the given type of transaction?
Thanks in advance.
Let's start with the simplest transaction mode and increase complexity.
No transaction
A 'no transaction' connection is one whose operations can be neither committed nor rolled back, such as sending email: once you have passed the message object to the email server, it is sent to the recipient and no amount of pleading will ever get the message back again. It's almost as if every call is committed by the time the call returns. Examples of this kind of connection include connections to SMTP servers, SMS gateways, printers and so on.
I believe that you can use a database connection in this manner if you have auto-commit on, but it raises the question of why you have a full ACID database in the first place...
'Normal' transactions
The normal connection, for example to a SQL database, has the ability to store up a series of state change commands in an internal buffer. When everything has been done, and all appears OK, then the whole buffer of changes is written to the data store and other connections can see the changes. If something goes wrong, before or even during the commit, the whole set of changes can be discarded (rolled back).
One critical limitation of this type of connection is the scope of the buffer - the buffer is part of the connection itself. In other words, it is only through the connection that you can write to the buffer.
An important responsibility of the application server is to manage these connections. When you ask the connection pool to give you the connection, you magically get the same connection each time (within a single transaction). This is true even when one EJB calls another or when an EJB calls into a Resource Adapter (assuming you use the REQUIRES_TRANSACTION semantics; you can override this with REQUIRES_NEW). This behaviour means that one web request can make multiple EJB calls, each of which can interact with multiple entity beans, and all the data manipulation occurs on a single connection with a single internal buffer. It will all be committed or rolled back together.
Transactions with multiple connections
This is great when you have a single database - but (by definition) you need separate connections if you talk to separate database instances (eg on different machines). So what happens then? Your EJB transaction ends up associated with multiple connections - each connection to a unique database. This appears to work well, except in one situation:
You have Connection A to Database A and Connection B to Database B
You execute DML statements on A and B
You commit the EJB transaction. The application server now:
Commits Connection A - success
Commits Connection B - failure (eg a constraint fails) and Connection B rolls back
This is a disaster - you have committed the transaction in Database A, and this cannot now be rolled back. However, the transaction (and the whole EJB) is rolled back on Database B.
(Interestingly, your example is almost identical to this - you have data committed to the no transaction and normal transaction, but not in the XA transaction - the last of the three connections)
XA Transactions
This is where XA comes in. It provides logic to co-ordinate transactions being committed against different data sources and simulates a single transaction over multiple data sources. XA commits with a "two-phase commit" managed by a transaction co-ordinator that manages a number of XA-connections co-opted into the XA Transaction. The co-ordinator:
Sends a message to each data source through the XA Connection to see if the transaction can be committed: All constraint and database logic is executed up to the point just before a final commit. If any database reports a failure, the XA co-ordinator rolls back the whole transaction. Phase 1 is where almost all the transaction work is carried out and so takes comparatively long
When every database has reported that the transaction can be committed, the co-ordinator sends a message to every database to commit the transaction. This happens very fast.
Note that the two-phase commit can fail if something goes wrong in phase 2 (eg part of the network crashes or one of the databases is powered off between phase 1 and phase 2).
Because an XA connection behaves so differently from a normal connection, it typically needs a different ConnectionFactory object which instantiates different object instances than a non-XA ConnectionFactory. In addition, the XA ConnectionFactory needs configuration parameters for the XA transaction co-ordinator, such as XA transaction timeouts, which are in addition to the ordinary transaction properties.
Another constraint: Only Connections created by an XA ConnectionFactory can join an XA Transaction and the associated two-phase commit. You can have both XA and non-XA connections participating in a single Application Server transaction, but then the entire transaction cannot reliably commit/rollback as a single transaction (as above).
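To make the shape of that protocol concrete, here is a schematic sketch (in C#, like the other examples on this page; it is not the javax.transaction or XA API, just the co-ordinator's logic in miniature):

using System.Collections.Generic;

// What each XA-style participant must be able to do.
interface IParticipant
{
    bool Prepare();  // phase 1: do all the work, vote commit or rollback
    void Commit();   // phase 2: make the prepared work durable (fast)
    void Rollback();
}

static class Coordinator
{
    public static void TwoPhaseCommit(IReadOnlyList<IParticipant> participants)
    {
        // Phase 1: every participant must vote yes.
        foreach (var p in participants)
        {
            if (!p.Prepare())
            {
                foreach (var q in participants) q.Rollback();
                return;
            }
        }
        // Phase 2: all voted yes; tell everyone to commit.
        foreach (var p in participants) p.Commit();
    }
}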
Specific answers
I want to know: when the transaction has failed at one point, is it supposed to roll back all the data from all the tables, or just the data of the XA service?
If the transaction fails before the application server attempts a commit (eg your EJB gets an NPE or you deliberately roll back), each connection will receive a rollback, and everything should be just as you expect.
However, if the transaction fails in the commit logic (eg a database constraint), then the transaction manager will attempt to roll everything back; this cannot happen if a non-XA connection has already committed.
I also wanted to know: a 'transaction', as I understand it, is a procedure for changing data atomically. So why does connection creation include defining the type of transaction the connection can perform; isn't that a property of the transaction?
The XA connection uses a different library and protocol than the ordinary connection, because the connection itself needs to communicate with the XA Transaction Co-ordinator. Normal connections don't do this.
I also want to know why we have to define the transaction type in the connection properties; shouldn't we instead define the type of transaction when we start it, and have the transaction manager perform the given type of transaction?
Because the XA connection uses different code, the connection pool needs to load a different class when compared to the normal connection. This is why the connection pool (not connection) properties are different.
Yes, if a transaction fails to write its commit entry to the log file, then it rolls back completely (the atomicity property of transactions).
A transaction is an atomic unit of database processing: whatever operation you perform in the database using a transaction, that action will be atomic.
By default, transactions are autocommit; but if you use your own code to mark the start point and commit point of a transaction, then it is an explicit transaction. (http://msdn.microsoft.com/en-us/library/ms172353.aspx)
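A sketch of such an explicit transaction in ADO.NET (connection string, table, and SQL are placeholders):

using System.Data.SqlClient;

const string connectionString = "..."; // placeholder

using var connection = new SqlConnection(connectionString);
connection.Open();
using var transaction = connection.BeginTransaction(); // explicit start point
try
{
    new SqlCommand("UPDATE Accounts SET Balance = Balance - 10 WHERE Id = 1",
                   connection, transaction).ExecuteNonQuery();
    new SqlCommand("UPDATE Accounts SET Balance = Balance + 10 WHERE Id = 2",
                   connection, transaction).ExecuteNonQuery();
    transaction.Commit();    // explicit commit point
}
catch
{
    transaction.Rollback();  // undo both updates on any failure
    throw;
}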
